A Bayesian Antedependence Model for Whole Genome Prediction

Hierarchical mixed effects models have proven powerful for predicting the genomic merit of livestock and plants from high-density single-nucleotide polymorphism (SNP) marker panels, and their use is increasingly advocated for genomic prediction in human health. Two particularly popular approaches, BayesA and BayesB, specify all SNP-associated effects to be independent of each other. BayesB extends BayesA by allowing a large proportion of SNP markers to have null effects. We further extend these two models to specify SNP effects as spatially correlated, reflecting the chromosomally proximal effects of causal variants. The resulting models, which we dub ante-BayesA and ante-BayesB, respectively, are based on a first-order nonstationary antedependence specification between SNP effects. In a simulation study involving 20 replicate data sets, each analyzed at six different SNP marker densities with average linkage disequilibrium (LD) levels ranging from r² = 0.15 to 0.31, the antedependence methods had significantly (P < 0.01) higher accuracies than their corresponding classical counterparts at higher LD levels (r² > 0.24), with differences exceeding 3%. A cross-validation study was also conducted on the heterogeneous stock mice data resource (http://mus.well.ox.ac.uk/mouse/HS/) using 6-week body weights as the phenotype. The antedependence methods increased cross-validation prediction accuracies by up to 3.6% compared to their classical counterparts (P < 0.001). Finally, we applied our methods to other benchmark data sets and demonstrated that the antedependence methods were more accurate than their classical counterparts for genomic predictions, even for individuals several generations beyond the training data.
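The first-order antedependence specification between SNP effects can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation. It assumes the standard first-order recursion, in which each SNP effect equals a locus-specific coefficient times the effect of the preceding SNP plus an independent innovation. The function name, the parameter names (`t_mean`, `t_sd`, `delta_sd`, `pi_null`), the normal distributions, the single shared innovation variance, and all default values are illustrative assumptions that do not come from the abstract.

```python
import random


def simulate_ante_effects(n_snps, t_mean=0.5, t_sd=0.1, delta_sd=1.0,
                          pi_null=0.0, seed=0):
    """Draw SNP effects along a chromosome under a first-order
    antedependence recursion (illustrative assumptions throughout):

        g[0] = delta[0]
        g[j] = t[j] * g[j-1] + delta[j]   for j >= 1

    t[j] is a locus-specific coefficient, which is what makes the
    dependence nonstationary. delta[j] is an independent innovation.
    With probability pi_null the innovation is set to exactly zero,
    mimicking the BayesB-style mass of null effects.
    """
    rng = random.Random(seed)
    effects, coefs, deltas = [], [], []
    prev = 0.0
    for j in range(n_snps):
        # The first SNP has no predecessor, so its coefficient is zero.
        t = rng.gauss(t_mean, t_sd) if j > 0 else 0.0
        if rng.random() < pi_null:
            delta = 0.0
        else:
            delta = rng.gauss(0.0, delta_sd)
        g = t * prev + delta
        effects.append(g)
        coefs.append(t)
        deltas.append(delta)
        prev = g
    return effects, coefs, deltas


if __name__ == "__main__":
    g, t, d = simulate_ante_effects(10, pi_null=0.5, seed=1)
    for j, (gj, tj, dj) in enumerate(zip(g, t, d)):
        print(f"SNP {j}: t={tj:+.3f} delta={dj:+.3f} effect={gj:+.3f}")
```

Setting `t_mean=0.0` and `t_sd=0.0` removes the dependence between neighbouring effects, so the sketch reduces to independent effects in the spirit of the classical BayesA and BayesB specifications. Note that in this sketch a null innovation does not force a zero effect when the preceding effect is nonzero, because the effect is still carried forward through the coefficient.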
