Using information of relatives in genomic prediction to apply effective stratified medicine

Genomic prediction shows promise for personalised medicine, in which diagnosis and treatment of complex diseases are tailored to individuals based on their genetic profiles. We present a theoretical framework demonstrating that prediction accuracy can be improved by including, in the data set used to generate the predictors (“discovery sample”), more informative individuals, namely those with close genetic relationships to the subjects put forward for risk prediction. The increase in prediction accuracy from closer relationships is achieved under an additive model and does not rely on any family or interaction effects. Using theory, simulations and real data analyses, we show that the predictive accuracy, or the area under the receiver operating characteristic curve (AUC), increased exponentially with decreasing effective population size (Ne), i.e. when individuals are closely related. For example, with a discovery sample size of N = 3000, heritability h² = 0.5 and population prevalence K = 0.1, the AUC approached 0.9 and the top percentile of the estimated genetic profile scores had a 23 times higher proportion of cases than the general population. This suggests that there is considerable room to increase prediction accuracy by using a design that does not exclude closer relationships.
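The dependence of AUC on relatedness can be illustrated with a minimal sketch. It is not the framework of this paper: it combines two published approximations as assumptions. The first is a Daetwyler-style expression for the squared accuracy of a genomic predictor, r² = N·h² / (N·h² + Me), where Me (the number of effective chromosome segments, which shrinks as Ne decreases) is a hypothetical input chosen for illustration. The second is the normal-theory approximation to AUC under the liability threshold model. The function names and the Me values are illustrative only, and the quantitative-trait form of r² is applied to liability without any correction for case-control ascertainment.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def prediction_r2(n, h2, m_e):
    """Squared accuracy of the predictor (assumed Daetwyler-style approximation).

    n   : discovery sample size
    h2  : heritability (here taken on the liability scale)
    m_e : number of effective chromosome segments (hypothetical input;
          smaller when discovery and target individuals are closely related)
    """
    return n * h2 / (n * h2 + m_e)


def auc_liability(h2, r2, k):
    """Approximate AUC of a predictor explaining h2 * r2 of liability variance.

    Uses a normal approximation to the distribution of predicted genetic
    values in cases and controls under the liability threshold model.
    k is the population prevalence.
    """
    explained = h2 * r2            # liability variance captured by the predictor
    if explained <= 0:
        return 0.5                 # an uninformative predictor
    t = _STD_NORMAL.inv_cdf(1 - k)  # liability threshold
    z = _STD_NORMAL.pdf(t)          # normal density at the threshold
    i = z / k                       # mean liability of cases
    v = -i * k / (1 - k)            # mean liability of controls
    var_cases = 1 - explained * i * (i - t)
    var_controls = 1 - explained * v * (v - t)
    scaled_diff = (i - v) * explained / sqrt(explained * (var_cases + var_controls))
    return _STD_NORMAL.cdf(scaled_diff)


if __name__ == "__main__":
    n, h2, k = 3000, 0.5, 0.1       # values quoted in the abstract
    for m_e in (50_000, 5_000, 500):  # illustrative values only
        r2 = prediction_r2(n, h2, m_e)
        print(f"Me={m_e:>6}  r2={r2:.3f}  AUC={auc_liability(h2, r2, k):.3f}")
```

Running the script prints the AUC for three illustrative values of Me at the N, h² and K quoted above. Under these assumptions, AUC rises as Me falls, which is the qualitative pattern described in the abstract; the sketch is not expected to reproduce the reported figures.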
