Predictive accuracy of combined genetic and environmental risk scores

The substantial heritability of most complex diseases suggests that genetic data could provide useful risk prediction. To date, the performance of genetic risk scores has fallen short of the potential implied by heritability, but this can be explained by sample sizes that are too small to estimate highly polygenic models accurately. When risk predictors based on environment or lifestyle already exist, two key questions arise: to what extent can they be improved by adding genetic information, and what is the ultimate potential of combined genetic and environmental risk scores? Here, we extend previous work on the predictive accuracy of polygenic scores to allow for an environmental score that may be correlated with the polygenic score, for example when the environmental factors mediate the genetic risk. We derive common measures of predictive accuracy and improvement as functions of the training sample size, the chip heritabilities of the disease and of the environmental score, and the genetic correlation between disease and environmental risk factors. We consider both simple addition of the two scores and a weighted sum that accounts for their correlation. Using examples from studies of cardiovascular disease and breast cancer, we show that improvements in discrimination are generally small, but that reasonable degrees of reclassification could be obtained with current sample sizes. Correlation between genetic and environmental scores has only minor effects on numerical results in realistic scenarios. In the longer term, as the accuracy of polygenic scores improves, they will come to dominate environmental scores in their contribution to predictive accuracy.
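The contrast between simple addition and a correlation-aware weighted sum can be illustrated with a minimal sketch. This is not the derivation from the paper: it uses only standard linear regression algebra and assumes that each score is calibrated (a unit change in the score predicts a unit change in the outcome), that both scores are positively correlated with the outcome, and that accuracy is measured as the squared correlation with the outcome. The function names and the input values are hypothetical.

```python
import math


def r2_simple_sum(r2_g, r2_e, rho):
    """Squared correlation between the outcome and the unweighted sum G + E.

    Assumption (not from the paper): G and E are calibrated scores, so
    var(G) = cov(G, Y) = r2_g and var(E) = cov(E, Y) = r2_e, with the
    outcome Y scaled to unit variance. rho is corr(G, E).
    """
    cov_ge = rho * math.sqrt(r2_g * r2_e)
    cov_sum_y = r2_g + r2_e                # cov(G + E, Y)
    var_sum = r2_g + r2_e + 2.0 * cov_ge   # var(G + E)
    return cov_sum_y ** 2 / var_sum


def r2_weighted_sum(r2_g, r2_e, rho):
    """Squared multiple correlation from regressing the outcome on G and E.

    This is the best achievable linear combination of the two scores
    given their correlation rho (requires |rho| < 1).
    """
    cross = 2.0 * rho * math.sqrt(r2_g * r2_e)
    return (r2_g + r2_e - cross) / (1.0 - rho ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: the genetic score explains 5% of outcome
    # variance and the environmental score explains 10%.
    for rho in (0.0, 0.1, 0.3):
        simple = r2_simple_sum(0.05, 0.10, rho)
        weighted = r2_weighted_sum(0.05, 0.10, rho)
        print(f"rho={rho:.1f}  simple={simple:.4f}  weighted={weighted:.4f}")
```

Under these assumptions the two combinations coincide when the scores are uncorrelated, and the weighted sum is never worse than the simple sum when they are correlated.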

[1]  Tanya M. Teslovich,et al.  Discovery and refinement of loci associated with lipid levels , 2013, Nature Genetics.

[2]  Jbs Board Joint British Societies’ consensus recommendations for the prevention of cardiovascular disease (JBS3) , 2014, Heart.

[3]  Markus Perola,et al.  Genomic prediction of coronary heart disease , 2016, bioRxiv.

[4]  M. Inouye,et al.  Genomic risk prediction of complex human disease and its clinical application. , 2015, Current opinion in genetics & development.

[5]  Tom R. Gaunt,et al.  Sixty-Five Common Genetic Variants and Prediction of Type 2 Diabetes , 2014, Diabetes.

[6]  Ross M. Fraser,et al.  Genetic studies of body mass index yield new insights for obesity biology , 2015, Nature.

[7]  Muin J. Khoury,et al.  Family history in public health practice: a genomic tool for disease prevention and health promotion. , 2010, Annual review of public health.

[8]  A. Hingorani,et al.  Marginal role for 53 common genetic variants in cardiovascular disease prediction , 2016, Heart.

[9]  Hon-Cheong So,et al.  A Unifying Framework for Evaluating the Predictive Power of Genetic Variants Based on the Level of Heritability Explained , 2010, PLoS genetics.

[10]  Peter M Visscher,et al.  Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk. , 2009, Human molecular genetics.

[11]  Beth Wilmot,et al.  Limited Clinical Utility of a Genetic Risk Score for the Prediction of Fracture Risk in Elderly Subjects , 2015, Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research.

[12]  D. Clayton Prediction and Interaction in Complex Disease Genetics: Experience in Type 1 Diabetes , 2009, PLoS genetics.

[13]  G A Colditz,et al.  Nurses' health study: log-incidence mathematical model of breast cancer incidence. , 1996, Journal of the National Cancer Institute.

[14]  Ayellet V. Segrè,et al.  Hundreds of variants clustered in genomic loci and biological pathways affect human height , 2010, Nature.

[15]  Jane E. Carpenter,et al.  Prediction of Breast Cancer Risk Based on Profiling With Common Genetic Variants , 2015, JNCI Journal of the National Cancer Institute.

[16]  M. Daly,et al.  An Atlas of Genetic Correlations across Human Diseases and Traits , 2015, Nature Genetics.

[17]  Stephen W Duffy,et al.  A breast cancer prediction model incorporating familial and personal risk factors , 2004, Hereditary Cancer in Clinical Practice.

[18]  David O Wilson,et al.  Lung Cancer Risk Prediction Using Common SNPs Located in GWAS-Identified Susceptibility Regions , 2015, Journal of thoracic oncology : official publication of the International Association for the Study of Lung Cancer.

[19]  Kathleen F. Kerr,et al.  Net reclassification indices for evaluating risk prediction instruments: a critical review. , 2014, Epidemiology.

[20]  R. Wilkins Polygenes, risk prediction, and targeted prevention of breast cancer. , 2008, The New England journal of medicine.

[21]  A. Peters,et al.  Genetic Markers Enhance Coronary Risk Prediction in Men: The MORGAM Prospective Cohorts , 2012, PloS one.

[22]  M. García-Closas,et al.  Combined associations of genetic and environmental risk factors: implications for prevention of breast cancer. , 2014, Journal of the National Cancer Institute.

[23]  Jianxin Shi,et al.  Developing and evaluating polygenic risk prediction models for stratified disease prevention , 2016, Nature Reviews Genetics.

[24]  Ewout W Steyerberg,et al.  Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers , 2011, Statistics in medicine.

[25]  J. Barrett,et al.  Genetic risk prediction in complex disease , 2011, Human molecular genetics.

[26]  O. Franco,et al.  Incremental predictive value of 152 single nucleotide polymorphisms in the 10-year risk prediction of incident coronary heart disease: the Rotterdam Study. , 2015, International journal of epidemiology.

[27]  S. Baker Putting risk prediction in perspective: relative utility curves. , 2009, Journal of the National Cancer Institute.

[28]  N. Wray,et al.  Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance components analysis , 2015, Nature Genetics.

[29]  Ross M. Fraser,et al.  Defining the role of common variation in the genomic and biological architecture of adult human height , 2014, Nature Genetics.

[30]  P. Visscher,et al.  The Genetic Interpretation of Area under the ROC Curve in Genomic Profiling , 2010, PLoS genetics.

[31]  J. Pankow,et al.  Prediction of coronary heart disease risk using a genetic risk score: the Atherosclerosis Risk in Communities Study. , 2007, American journal of epidemiology.

[32]  Laura J. Scott,et al.  Joint Analysis of Psychiatric Disorders Increases Accuracy of Risk Prediction for Schizophrenia, Bipolar Disorder, and Major Depressive Disorder , 2015, American journal of human genetics.

[33]  W. Willett,et al.  Breast Cancer Risk From Modifiable and Nonmodifiable Risk Factors Among White Women in the United States. , 2016, JAMA oncology.

[34]  Yingye Zheng,et al.  Integrating the predictiveness of a marker with its performance as a classifier. , 2007, American journal of epidemiology.

[35]  Andres Metspalu,et al.  Genome-wide genetic homogeneity between sexes and populations for human height and body mass index. , 2015, Human molecular genetics.

[36]  Udo Hoffmann,et al.  A Genetic Risk Score Is Associated With Incident Cardiovascular Disease and Coronary Artery Calcium: The Framingham Heart Study , 2012, Circulation. Cardiovascular genetics.

[37]  P. Visscher,et al.  Estimating missing heritability for disease from genome-wide association studies. , 2011, American journal of human genetics.

[38]  Sang Hong Lee,et al.  Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood , 2012, Bioinform..

[39]  F. Dudbridge Polygenic Epidemiology , 2016, Genetic epidemiology.

[40]  Karla Kerlikowske,et al.  Prospective breast cancer risk prediction model for women undergoing screening mammography. , 2006, Journal of the National Cancer Institute.

[41]  J. Danesh,et al.  Large-scale association analysis identifies new risk loci for coronary artery disease , 2013 .

[42]  Jennifer G. Robinson,et al.  2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk: A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines , 2014, Circulation.

[43]  V. Salomaa,et al.  Genetic Risk Prediction and a 2-Stage Risk Screening Strategy for Coronary Heart Disease , 2013, Arteriosclerosis, thrombosis, and vascular biology.

[44]  Nicholas Eriksson,et al.  Comparison of Family History and SNPs for Predicting Risk of Complex Disease , 2012, PLoS genetics.

[45]  M. Weedon,et al.  Type 1 Diabetes Genetic Risk Score: A Novel Tool to Discriminate Monogenic and Type 1 Diabetes , 2016, Diabetes.

[46]  Matthew C Keller,et al.  Recent methods for polygenic analysis of genome-wide data implicate an important effect of common variants on cardiovascular disease risk , 2011, BMC Medical Genetics.

[47]  Jaana M. Hartikainen,et al.  Large-scale genotyping identifies 41 new loci associated with breast cancer risk , 2013, Nature Genetics.

[48]  Carol Coupland,et al.  Derivation, validation, and evaluation of a new QRISK model to estimate lifetime risk of cardiovascular disease: cohort study using QResearch database , 2010, BMJ : British Medical Journal.

[49]  N. Cook Use and Misuse of the Receiver Operating Characteristic Curve in Risk Prediction , 2007, Circulation.

[50]  M. Gail,et al.  Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. , 1989, Journal of the National Cancer Institute.

[51]  Jennifer G. Robinson,et al.  2013 ACC/AHA Guideline on the Assessment of Cardiovascular Risk: A Report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines , 2014, Circulation.

[52]  Nilanjan Chatterjee,et al.  Projecting the performance of risk prediction based on polygenic analyses of genome-wide association studies , 2013, Nature Genetics.

[53]  K. Kendler,et al.  The heritability of alcohol use disorders: a meta-analysis of twin and adoption studies , 2014, Psychological Medicine.

[54]  Douglas F. Easton,et al.  Polygenic susceptibility to breast cancer and implications for prevention , 2002, Nature Genetics.

[55]  Jianjun Liu,et al.  Breast cancer risk prediction and individualised screening based on common genetic variation and breast density measurement , 2011, Breast Cancer Research.

[56]  B. Ponder,et al.  Polygenes, risk prediction, and targeted prevention of breast cancer. , 2008, The New England journal of medicine.

[57]  M. Pencina,et al.  Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond , 2008, Statistics in medicine.

[58]  F. Dudbridge Power and Predictive Accuracy of Polygenic Risk Scores , 2013, PLoS genetics.

[59]  M. Thun,et al.  Performance of Common Genetic Variants in Breast-cancer Risk Models , 2022 .

[60]  D. Easton,et al.  The BOADICEA model of genetic susceptibility to breast and ovarian cancer , 2004, British Journal of Cancer.

[61]  A. Hofman,et al.  Predicting human height by Victorian and genomic methods , 2009, European Journal of Human Genetics.

[62]  Peter M Visscher,et al.  Prediction of individual genetic risk to disease from genome-wide association studies. , 2007, Genome research.

[63]  M. Pencina,et al.  General Cardiovascular Risk Profile for Use in Primary Care: The Framingham Heart Study , 2008, Circulation.

[64]  Frank Dudbridge,et al.  A Fast Method that Uses Polygenic Scores to Estimate the Variance Explained by Genome-wide Marker Panels and the Proportion of Variants Affecting a Trait. , 2015, American journal of human genetics.

[65]  Christian Gieger,et al.  Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies , 2010, Nature Genetics.

[66]  Jing Fan,et al.  The Net Reclassification Index (NRI): A Misleading Measure of Prediction Improvement Even with Independent Test Data Sets , 2015, Statistics in biosciences.