Accuracy of Predicting the Genetic Risk of Disease Using a Genome-Wide Approach

Background
The prediction of the genetic disease risk of an individual is a powerful public health tool. While predicting risk has been successful in diseases that follow simple Mendelian inheritance, it has proven challenging in complex diseases for which a large number of loci contribute to the genetic variance. The large numbers of single nucleotide polymorphisms now available provide new opportunities for predicting the genetic risk of complex diseases with high accuracy.

Methodology/Principal Findings
We have derived simple deterministic formulae to predict the accuracy of predicted genetic risk from population or case-control studies using a genome-wide approach and assuming a dichotomous disease phenotype with an underlying continuous liability. We show that the prediction equations are special cases of the more general problem of predicting the accuracy of estimates of genetic values of a continuous phenotype. Our predictive equations are responsive to all parameters that affect accuracy, and they are independent of allele frequency and effect distributions. When tested by simulation, the errors of the deterministic predictions were generally small. The common link among the expressions for accuracy is that each is best summarized as the product of two quantities: the ratio of the number of phenotypic records to the number of risk loci, and the observed heritability.

Conclusions/Significance
This study advances the understanding of the relative power of case-control and population studies of disease. The predictions represent an upper bound of accuracy, which may be achievable with improved effect estimation methods. The formulae derived will help researchers determine an appropriate sample size to attain a certain accuracy when predicting genetic risk.
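The abstract does not reproduce the formulae themselves, so the following is only a minimal sketch of how the stated summary quantity could be used. It assumes the accuracy r (the correlation between true and predicted genetic value) for a continuous phenotype takes the form r = sqrt(λh² / (λh² + 1)), where λ is the number of phenotypic records divided by the number of risk loci and h² is the heritability on the observed scale. The function names `predicted_accuracy` and `records_needed` are hypothetical, and the adjustments for dichotomous phenotypes and case-control sampling are not modeled here.

```python
import math


def predicted_accuracy(n_records, n_loci, h2):
    """Accuracy under the ASSUMED form r = sqrt(lambda*h2 / (lambda*h2 + 1)).

    n_records: number of phenotypic records
    n_loci:    number of independent risk loci
    h2:        heritability on the observed scale, in (0, 1]
    """
    if n_records <= 0 or n_loci <= 0 or not 0 < h2 <= 1:
        raise ValueError("need n_records > 0, n_loci > 0 and 0 < h2 <= 1")
    # The summary quantity named in the abstract: (records per locus) * heritability.
    x = (n_records / n_loci) * h2
    return math.sqrt(x / (x + 1.0))


def records_needed(target_accuracy, n_loci, h2):
    """Invert the assumed formula: records required to reach a target accuracy."""
    if not 0 < target_accuracy < 1 or n_loci <= 0 or not 0 < h2 <= 1:
        raise ValueError("need 0 < target_accuracy < 1, n_loci > 0 and 0 < h2 <= 1")
    r2 = target_accuracy ** 2
    # From r^2 = x / (x + 1) it follows that x = r^2 / (1 - r^2),
    # and x = (n_records / n_loci) * h2.
    return n_loci * r2 / (h2 * (1.0 - r2))
```

Under this assumed form, accuracy depends on the inputs only through the product λh², so halving the heritability requires doubling the number of records to hold accuracy constant. For example, `records_needed(0.8, 1000, 0.5)` gives the number of records at which `predicted_accuracy` returns 0.8 for 1,000 loci and an observed heritability of 0.5.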
