Components of the accuracy of genomic prediction in a multi-breed sheep population.

In genome-wide association studies, failure to remove variation due to population structure results in spurious associations. In contrast, for predictions of future phenotypes or estimated breeding values from dense SNP data, exploiting population structure arising from relatedness can actually increase the accuracy of prediction in some cases, for example, when the selection candidates are offspring of the reference population from which the prediction equation was derived. In populations with large effective population size or with multiple breeds and strains, it has not been demonstrated whether and when accounting for or removing variation due to population structure will affect the accuracy of genomic prediction. Our aim in this study was to determine whether accounting for population structure would increase the accuracy of genomic predictions, both within and across breeds. First, we attempted to decompose the accuracy of genomic prediction into contributions from population structure and from linkage disequilibrium (LD) between markers and QTL, using a diverse multi-breed sheep (Ovis aries) data set genotyped for 48,640 SNP. We demonstrate that SNP from a single chromosome can achieve up to 86% of the accuracy of genomic predictions that use all SNP. This result suggests that most of the prediction accuracy is due to population structure, because a single chromosome is expected to capture relationships but is unlikely to contain all QTL. We then explored principal component analysis (PCA) as an approach to disentangle the respective contributions of population structure and LD between SNP and QTL to the accuracy of genomic predictions. Results showed that fitting an increasing number of principal components (PC) as covariates decreased within-breed accuracy until a lower plateau was reached. We speculate that this plateau is a measure of the accuracy due to LD.
In conclusion, a large proportion of the accuracy of genomic predictions in our data was due to variation associated with population structure. Surprisingly, accounting for this structure generally decreased the accuracy of across-breed genomic predictions.
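
The idea of fitting principal components as covariates can be illustrated with a minimal sketch. It is not the analysis used in the study: the data are simulated, the prediction model is a plain ridge regression on SNP (SNP-BLUP) with unpenalised covariates, and the function names (`top_pcs`, `fit_snp_blup`, `validation_accuracy`), the penalty `lam`, the two hypothetical breeds and the choice to compute PC from all animals jointly are assumptions made only for illustration. Accuracy is taken here as the correlation between the SNP-based prediction and the phenotype in validation animals.

```python
import numpy as np

def top_pcs(genotypes, k):
    """First k principal component scores of an (animals x SNP) matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    # SVD of the column-centered matrix; u * s gives the PC scores.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]

def fit_snp_blup(genotypes, covariates, y, lam):
    """Ridge regression on SNP with unpenalised intercept and covariates."""
    n = genotypes.shape[0]
    W = np.column_stack([np.ones(n), covariates])
    W_pinv = np.linalg.pinv(W)
    # Project the covariates out of y and the SNP, then solve the ridge
    # problem in its dual (animals x animals) form.
    P = np.eye(n) - W @ W_pinv
    Xr, yr = P @ genotypes, P @ y
    alpha = np.linalg.solve(Xr @ Xr.T + lam * np.eye(n), yr)
    snp_effects = Xr.T @ alpha
    fixed_effects = W_pinv @ (y - genotypes @ snp_effects)
    return fixed_effects, snp_effects

def validation_accuracy(genotypes, y, train, test, k, lam=100.0):
    """Correlation of SNP-based predictions with phenotypes in test animals."""
    pcs = top_pcs(genotypes, k)  # assumption: PC computed on all animals
    _, snp_effects = fit_snp_blup(genotypes[train], pcs[train], y[train], lam)
    return np.corrcoef(genotypes[test] @ snp_effects, y[test])[0, 1]

# Simulated example: two hypothetical breeds with different allele frequencies.
rng = np.random.default_rng(0)
n_per_breed, n_snp = 100, 500
freq_a = rng.uniform(0.1, 0.9, n_snp)
freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_snp), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, (n_per_breed, n_snp)),
    rng.binomial(2, freq_b, (n_per_breed, n_snp)),
]).astype(float)

# Phenotype = effects of 20 randomly chosen QTL + residual noise.
qtl = rng.choice(n_snp, 20, replace=False)
effects = np.zeros(n_snp)
effects[qtl] = rng.normal(0, 0.3, 20)
y = geno @ effects + rng.normal(0, 1, geno.shape[0])

# Random split into reference (train) and validation (test) animals.
order = rng.permutation(geno.shape[0])
train, test = order[:150], order[150:]

for k in (0, 1, 2, 5, 10):
    acc = validation_accuracy(geno, y, train, test, k)
    print(f"PC fitted: {k:2d}  accuracy: {acc:.3f}")
```

Fitting more PC removes progressively more of the between-animal structure from the SNP effects, so the printed accuracies give a rough feel for how much of the prediction depends on that structure in this toy data set. The numbers depend entirely on the simulation settings and say nothing about the sheep data.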
