Accurate Prediction of Genetic Values for Complex Traits by Whole-Genome Resequencing

Whole-genome resequencing technology has improved rapidly in recent years and is expected to improve further, such that sequencing an entire human genome for $1000 is within reach. Our main aim here is to use whole-genome sequence data to predict the genetic values of individuals for complex traits and to explore the accuracy of such predictions. This is relevant for plant and animal breeding and, in human genetics, for predicting an individual's risk for complex diseases. Here, population history and genomic architectures were simulated under the Wright–Fisher population model and the infinite-sites mutation model, and genetic values were predicted by the genomic selection approach, in which a Bayesian nonlinear model was used to predict the effects of individual SNPs. The Bayesian model assumed a priori that only a few SNPs are causative, i.e., have an effect different from zero. When whole-genome sequence data were used, the accuracy of prediction of genetic value increased by more than 40% relative to the use of dense ∼30K SNP chips. At equally high marker density, including the causative mutations yielded an additional increase in accuracy of 2.5–3.7%. Predictions of genetic value remained accurate even when the training and evaluation data were 10 generations apart. Best linear unbiased prediction (BLUP) of SNP effects does not take full advantage of genome sequence data, and nonlinear predictions, such as the Bayesian method used here, are needed to achieve maximum accuracy. On the basis of theoretical work, the results could be extended to more realistic genome and population sizes.
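As a rough illustration of the prediction setup described above, the following sketch estimates SNP effects in a training set and measures accuracy as the correlation between predicted and true genetic values in a separate evaluation set. It implements the linear SNP-BLUP baseline (ridge regression), not the Bayesian nonlinear model used in the study. All function names, sample sizes, and parameter values are hypothetical, and the genotypes are drawn as independent SNPs rather than simulated under a Wright–Fisher model.

```python
import numpy as np


def simulate_population(n_ind, n_snps, n_causal, h2, rng):
    """Simulate genotypes, true genetic values, and phenotypes.

    Assumption: SNPs are independent (no linkage disequilibrium), which is
    much simpler than a Wright-Fisher simulation.
    """
    freqs = rng.uniform(0.1, 0.9, size=n_snps)
    geno = rng.binomial(2, freqs, size=(n_ind, n_snps)).astype(float)
    # Standardize each SNP column; guard against monomorphic columns.
    sd = geno.std(axis=0)
    sd[sd == 0] = 1.0
    X = (geno - geno.mean(axis=0)) / sd
    # Only a few SNPs are causative; all other effects are exactly zero.
    effects = np.zeros(n_snps)
    causal = rng.choice(n_snps, size=n_causal, replace=False)
    effects[causal] = rng.normal(size=n_causal)
    g = X @ effects
    # Scale the residual so the trait has heritability h2.
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)
    y = g + rng.normal(0.0, noise_sd, size=n_ind)
    return X, y, g


def snp_blup(X, y, lam):
    """Ridge-regression (SNP-BLUP) estimates of the mean and SNP effects."""
    mu = y.mean()
    lhs = X.T @ X + lam * np.eye(X.shape[1])
    b = np.linalg.solve(lhs, X.T @ (y - mu))
    return mu, b


def run_demo(seed=1, n_train=500, n_eval=200, n_snps=200, n_causal=10, h2=0.5):
    """Return prediction accuracy: corr(predicted, true genetic value)."""
    rng = np.random.default_rng(seed)
    X, y, g = simulate_population(n_train + n_eval, n_snps, n_causal, h2, rng)
    X_train, y_train = X[:n_train], y[:n_train]
    X_eval, g_eval = X[n_train:], g[n_train:]
    # Shrinkage parameter under equal variance per SNP:
    # residual variance / per-SNP variance = n_snps * (1 - h2) / h2.
    lam = n_snps * (1.0 - h2) / h2
    _, b = snp_blup(X_train, y_train, lam)
    return float(np.corrcoef(X_eval @ b, g_eval)[0, 1])


if __name__ == "__main__":
    print(f"accuracy of SNP-BLUP prediction: {run_demo():.3f}")
```

SNP-BLUP shrinks every SNP effect by the same amount, whereas a prior that sets most effects to zero can concentrate the signal on the few causative SNPs. This is the contrast the abstract draws between BLUP and the Bayesian method.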
