Predicting quantitative traits from genome and phenome with near perfect accuracy

Despite decades of linkage and association studies and their potential impact on human health, reliable prediction of an individual's risk for heritable disease remains difficult. Large numbers of mapped loci do not explain substantial fractions of heritable variation, leaving open the question of whether accurate complex trait predictions can be achieved in practice. Here, we use a genome-sequenced population of ∼7,000 yeast strains of high but varying relatedness, and predict growth traits from family information, effects of segregating genetic variants, and growth in other environments, with an average coefficient of determination R² of 0.91. This accuracy exceeds narrow-sense heritability, approaches the limits imposed by measurement repeatability, and is higher than that achieved with a single assay in the laboratory. Our results prove that very accurate prediction of complex traits is possible, and suggest that additional data from families rather than reference cohorts may be more useful for this purpose.
