Comparison of methods for the implementation of genome-assisted evaluation of Spanish dairy cattle.

The aim of this study was to evaluate methods for genomic evaluation of the Spanish Holstein population as an initial step toward the implementation of routine genomic evaluations. This study describes the population structure of progeny-tested bulls in Spain at the genomic level and compares different genomic evaluation methods with regard to accuracy and bias. Two Bayesian linear regression models, Bayes-A and Bayesian-LASSO (B-LASSO), as well as a machine learning algorithm, Random-Boosting (R-Boost), and BLUP using a realized genomic relationship matrix (G-BLUP), were compared. Five traits that are currently under selection in the Spanish Holstein population were used: milk yield, fat yield, protein yield, fat percentage, and udder depth. In total, genotypes from 1859 progeny-tested bulls were used. The training sets were composed of bulls born before 2005, comprising 1601 bulls for production and 1574 bulls for type, whereas the testing sets contained 258 and 235 bulls born in 2005 or later for production and type, respectively. Deregressed proofs (DRP) from the January 2009 Interbull (Uppsala, Sweden) evaluation were used as the dependent variables for bulls in the training sets, whereas DRP from the December 2011 Interbull evaluation were used to compare genomic predictions with progeny test results for bulls in the testing sets. Genomic predictions were more accurate than traditional pedigree indices for predicting future progeny test results of young bulls. The gain in accuracy due to inclusion of genomic data varied by trait and ranged from 0.04 to 0.42 Pearson correlation units. Results averaged across traits showed that B-LASSO had the highest accuracy, with an advantage of 0.01, 0.03, and 0.03 points in Pearson correlation compared with R-Boost, Bayes-A, and G-BLUP, respectively.
The B-LASSO predictions also showed the least bias (0.02, 0.03, and 0.10 SD units less than Bayes-A, R-Boost, and G-BLUP, respectively), as measured by the mean difference between genomic predictions and progeny test results. The R-Boost algorithm provided genomic predictions with regression coefficients closer to unity, which is an alternative measure of bias, for 4 out of 5 traits and also resulted in mean squared error estimates that were 2%, 10%, and 12% smaller than those of B-LASSO, Bayes-A, and G-BLUP, respectively. The observed prediction accuracy obtained with these methods was within the range of values expected for a population of similar size, suggesting that the prediction method and reference population described herein are appropriate for implementation of routine genome-assisted evaluations in Spanish dairy cattle. R-Boost is a competitive marker regression methodology in terms of predictive ability that can accommodate large data sets.
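
The four validation criteria named above (Pearson correlation, mean difference in SD units, regression coefficient, and mean squared error) can be illustrated with a minimal sketch. This is not the authors' code: the function name `validation_metrics`, the choice to regress observed DRP on genomic predictions, the use of the SD of the observed DRP for scaling the mean difference, and the toy numbers are all assumptions made for illustration.

```python
from math import sqrt


def validation_metrics(predicted, observed):
    """Compare genomic predictions with later progeny-test results (e.g., DRP).

    Assumptions (not stated in the abstract): the regression coefficient is
    the slope of observed on predicted, and the mean difference is scaled by
    the sample SD of the observed values.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Corrected sums of squares and cross-products.
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))

    sd_obs = sqrt(syy / (n - 1))  # sample SD of the observed values

    return {
        # Accuracy: Pearson correlation between predictions and observations.
        "pearson_r": sxy / sqrt(sxx * syy),
        # Bias: mean (prediction - observation), expressed in SD units.
        "mean_diff_sd": (mean_p - mean_o) / sd_obs,
        # Alternative bias measure: slope of observed on predicted (ideal = 1).
        "slope": sxy / sxx,
        # Mean squared error of prediction.
        "mse": sum((o - p) ** 2 for p, o in zip(predicted, observed)) / n,
    }


# Toy values, made up purely to exercise the function.
toy_predicted = [1.0, 2.0, 3.0, 4.0]
toy_observed = [1.5, 2.5, 3.5, 4.5]
toy_metrics = validation_metrics(toy_predicted, toy_observed)
```

In this sketch a slope below 1 would indicate predictions that are too spread out (inflated), and a slope above 1 predictions that are too compressed, which is the usual reading when the observed values are regressed on the predictions.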