Technical note: an R package for fitting sparse neural networks with application in animal breeding.

Neural networks (NNs) have emerged as a new tool for genomic selection (GS) in animal breeding. However, the properties of NNs used in GS to predict phenotypic outcomes are not well characterized, because NNs are prone to over-parameterization and whole-genome marker sets are difficult to use as high-dimensional NN inputs. In this note, we present an R package called snnR that finds an optimal sparse structure of an NN by minimizing the squared error subject to a penalty on the L1-norm of the parameters (weights and biases), thereby addressing the problem of over-parameterization in NNs. We also tested several models fitted with the snnR package on example cases to demonstrate their feasibility and effectiveness. Compared with the R package brnn (Bayesian regularized single-layer NNs), with both packages using the entries of a genotype matrix or a genomic relationship matrix as inputs, snnR greatly improved computational efficiency and prediction ability for GS in animal breeding, because snnR implements a sparse NN with many hidden layers.
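The L1-penalized objective described above can be illustrated with a minimal Python sketch. It is not the snnR implementation: the function names (`soft_threshold`, `predict`, `penalized_loss`), the single hidden layer, the tanh activation, and the penalty weight `lam` are all assumptions made for illustration. The soft-thresholding operator is shown only as one common way in which an L1 penalty sets parameters exactly to zero; the note does not state that snnR uses it.

```python
import math


def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero and clip small values to exactly 0."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def predict(x, W1, b1, w2, b2):
    """One-hidden-layer network (assumed here): tanh hidden units, linear output."""
    hidden = [
        math.tanh(sum(wij * xj for wij, xj in zip(row, x)) + bi)
        for row, bi in zip(W1, b1)
    ]
    return sum(wk * hk for wk, hk in zip(w2, hidden)) + b2


def penalized_loss(X, y, W1, b1, w2, b2, lam):
    """Sum of squared errors plus lam times the L1 norm of all weights and biases."""
    sse = sum((predict(x, W1, b1, w2, b2) - yi) ** 2 for x, yi in zip(X, y))
    l1 = (
        sum(abs(w) for row in W1 for w in row)
        + sum(abs(b) for b in b1)
        + sum(abs(w) for w in w2)
        + abs(b2)
    )
    return sse + lam * l1
```

In this sketch a larger `lam` trades fit for sparsity: parameters whose magnitude falls below the threshold are set to zero, which corresponds to removing the associated connection from the network.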
