Assessment of bagging GBLUP for whole-genome prediction of broiler chicken traits.

Bootstrap aggregation (bagging) is a resampling method that can produce more accurate predictions when predictors are unstable or when the number of markers is much larger than the sample size, because averaging over bootstrap samples reduces prediction variance. The purpose of this study was to compare genomic best linear unbiased prediction (GBLUP) with bootstrap aggregated GBLUP (bagged GBLUP, or BGBLUP) in terms of prediction accuracy. The data comprised 1351 broiler chickens genotyped on a 600K Affymetrix platform and phenotyped for three traits: body weight, ultrasound measurement of breast muscle, and hen-house egg production. The predictive performance of GBLUP versus BGBLUP was evaluated in scenarios that included or excluded the top 20 markers from a standard genome-wide association study (GWAS) as fixed effects in the GBLUP model, and that varied the training sample size and the allele frequency bin. Predictive performance was assessed via five replications of a threefold cross-validation, using the correlation between observed and predicted values and the prediction mean-squared error. GBLUP overfitted the training data, and BGBLUP delivered better predictive ability in the testing sets. Including the top 20 GWAS markers in the model as fixed effects improved prediction accuracy and increased the advantage of BGBLUP over GBLUP. The performance of GBLUP and BGBLUP was similar across allele frequency bins and training sample sizes. In general, the results of this study confirm that BGBLUP can be valuable for enhancing genome-enabled prediction of complex traits.
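
The GBLUP versus BGBLUP comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a VanRaden-type genomic relationship matrix, a fixed residual-to-genetic variance ratio `lam` instead of estimated variance components, simulated genotypes and phenotypes, a single train/test split instead of replicated threefold cross-validation, and 25 bootstrap samples. All function and variable names are hypothetical.

```python
import numpy as np

def genomic_relationship(M):
    """VanRaden-type G from an (n x m) matrix of 0/1/2 allele counts."""
    p = M.mean(axis=0) / 2.0               # allele frequency per marker
    Z = M - 2.0 * p                        # center each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, lam):
    """Predict test phenotypes from training records.

    lam is an ASSUMED residual-to-genetic variance ratio; in practice the
    variance components would be estimated (e.g. by REML).
    """
    mu = y[train].mean()
    K = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(K, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def bagged_gblup_predict(G, y, train, test, lam, n_boot=25, seed=0):
    """Average GBLUP predictions over bootstrap samples of the training set."""
    rng = np.random.default_rng(seed)
    preds = np.zeros((n_boot, len(test)))
    for b in range(n_boot):
        # Resample training animals with replacement (duplicates allowed).
        boot = rng.choice(train, size=len(train), replace=True)
        preds[b] = gblup_predict(G, y, boot, test, lam)
    return preds.mean(axis=0)

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
n, m = 150, 400
M = rng.binomial(2, rng.uniform(0.1, 0.9, size=m), size=(n, m))
y = M @ rng.normal(0.0, 0.05, size=m) + rng.normal(0.0, 1.0, size=n)

G = genomic_relationship(M)
idx = rng.permutation(n)
test, train = idx[: n // 3], idx[n // 3:]   # one fold of a threefold split

for name, pred in [
    ("GBLUP", gblup_predict(G, y, train, test, lam=1.0)),
    ("BGBLUP", bagged_gblup_predict(G, y, train, test, lam=1.0)),
]:
    r = np.corrcoef(y[test], pred)[0, 1]    # predictive correlation
    mse = np.mean((y[test] - pred) ** 2)    # prediction mean-squared error
    print(f"{name}: r = {r:.3f}, MSE = {mse:.3f}")
```

The two criteria printed at the end correspond to the ones named in the abstract: the correlation between observed and predicted values, and the prediction mean-squared error. Results on simulated data of this kind say nothing about the broiler data set.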
