Why Breeding Values Estimated Using Familial Data Should Not Be Used for Genome-Wide Association Studies

In animal breeding, the genetic potential of an animal is summarized as its estimated breeding value, which is derived from its own performance as well as the performance of related individuals. Here, we illustrate why estimated breeding values are not suitable as a phenotype for genome-wide association studies. We simulated human-type and pig-type pedigrees with a range of quantitative trait loci (QTL) effects (0.5–3% of phenotypic variance) and heritabilities (0.3–0.8). We analyzed 1000 replicates of each scenario with four models: (a) a full mixed model including a polygenic effect, (b) a regression analysis using the residual of a mixed model as a trait score (the so-called GRAMMAR approach), (c) a regression analysis using the estimated breeding value as a trait score, and (d) a regression analysis using the raw phenotype as a trait score. We show that using breeding values as a trait score gives very high false-positive rates (up to 14% in human pedigrees and >60% in pig pedigrees). Simulations based on a real pedigree show that additional generations of pedigree increase the type I error. Including the family relationship as a random effect provides the greatest power to detect QTL while controlling type I error at the desired level and providing the most accurate estimates of the QTL effect. Both the use of residuals and the use of breeding values result in deflated estimates of the QTL effect. We derive the contributions of QTL effects to the breeding value and residual and show how this affects the estimates.
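The inflation of type I error under model (c) can be illustrated with a toy simulation. The sketch below is not the simulation design used in the study: it assumes a simple paternal half-sib structure, a null SNP with allele frequency 0.5, and a crude family-based breeding value proxy (a shrunken half-sib family mean), all of which are illustrative assumptions. The function name `simulate_fpr` and its parameters are hypothetical. It compares the false-positive rate of an ordinary regression on the SNP when the trait score is the family-based proxy versus the raw phenotype.

```python
import math
import numpy as np

def simulate_fpr(score_kind, n_sires=20, n_off=20, h2=0.5,
                 n_rep=500, alpha=0.05, seed=1):
    """False-positive rate of SNP regression at a null locus (toy half-sib design)."""
    rng = np.random.default_rng(seed)
    sire_var = h2 / 4.0                # between-family variance in a sire model
    resid_var = 1.0 - sire_var         # phenotypic variance scaled to 1
    # Shrinkage applied to the family mean (assumed simple EBV proxy)
    shrink = n_off / (n_off + resid_var / sire_var)
    family = np.repeat(np.arange(n_sires), n_off)
    n = n_sires * n_off
    hits = 0
    for _ in range(n_rep):
        # Null SNP: each offspring gets one of its sire's two alleles
        # plus a random maternal allele (unrelated dams assumed)
        sire_alleles = rng.integers(0, 2, size=(n_sires, 2))
        paternal = sire_alleles[family, rng.integers(0, 2, size=n)]
        maternal = rng.integers(0, 2, size=n)
        geno = (paternal + maternal).astype(float)
        # Phenotype depends on family but NOT on the SNP
        sire_effect = rng.normal(0.0, math.sqrt(sire_var), n_sires)
        y = sire_effect[family] + rng.normal(0.0, math.sqrt(resid_var), n)
        if score_kind == "ebv":
            fam_mean = y.reshape(n_sires, n_off).mean(axis=1)
            score = shrink * fam_mean[family]   # same value for all sibs
        else:
            score = y                           # raw phenotype
        # Ordinary least squares of score on genotype, ignoring relatedness
        x = geno - geno.mean()
        z = score - score.mean()
        sxx = x @ x
        beta = (x @ z) / sxx
        resid = z - beta * x
        se = math.sqrt((resid @ resid) / (n - 2) / sxx)
        p = math.erfc(abs(beta / se) / math.sqrt(2.0))  # normal approximation
        hits += p < alpha
    return hits / n_rep

if __name__ == "__main__":
    print("EBV proxy as trait score:", simulate_fpr("ebv"))
    print("Raw phenotype as trait score:", simulate_fpr("raw"))
```

In this toy setting the family-based score is identical for all sibs and genotypes are correlated within families, so the regression treats highly dependent observations as independent and rejects the null far more often than the nominal 5%. The raw phenotype is also affected by the ignored family structure, but to a smaller degree, because most of its variance is individual-specific.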
