Massively expedited genome-wide heritability analysis (MEGHA)

Significance

Given the dramatic expansion of available phenotypic data, practical tools for high-dimensional heritability-based screening are invaluable for prioritizing phenotypes for genetic studies. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Alternative methods based on whole-genome data from unrelated individuals exist but are computationally expensive. Here we present a novel, fast, and accurate statistical method for massively expedited genome-wide heritability analysis, making heritability-based prioritization of millions of phenotypes based on data from unrelated individuals tractable for the first time, to our knowledge. We apply our method to large-scale heritability analyses of brain imaging measurements and demonstrate its potential for facilitating phenome-wide analyses and characterizing the genetic architecture of complex traits.

Abstract

The discovery and prioritization of heritable phenotypes is a computational challenge in a variety of settings, including neuroimaging genetics and analyses of the vast phenotypic repositories in electronic health record systems and population-based biobanks. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Genome-wide complex trait analysis is an alternative tool that computes heritability estimates from unrelated individuals, using genome-wide data that are increasingly ubiquitous, but it is computationally demanding and becomes difficult to apply when evaluating very large numbers of phenotypes. Here we present a fast and accurate statistical method for high-dimensional heritability analysis using genome-wide SNP data from unrelated individuals, termed massively expedited genome-wide heritability analysis (MEGHA), together with accompanying nonparametric sampling techniques that enable flexible inferences for arbitrary statistics of interest. MEGHA produces estimates and significance measures of heritability with several orders of magnitude less computational time than existing methods, making heritability-based prioritization of millions of phenotypes based on data from unrelated individuals tractable for the first time, to our knowledge. As a demonstration, we conducted heritability analyses on global and local morphometric measurements derived from brain structural MRI scans, using genome-wide SNP data from 1,320 unrelated young healthy adults of non-Hispanic European ancestry. We also computed surface maps of heritability for cortical thickness measures and empirically localized cortical regions where thickness measures were significantly heritable. Our analyses demonstrate the unique capability of MEGHA for large-scale heritability-based screening and high-dimensional heritability profile construction.
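
The abstract does not spell out the MEGHA estimator, so the following is not the authors' method. As a rough illustration of how heritability can be estimated from unrelated individuals using only genome-wide SNP data, the sketch below uses Haseman-Elston regression, a simpler moment-based estimator swapped in for illustration, together with a permutation p-value as one example of a nonparametric sampling technique. All function names, the 0/1/2 genotype coding, and the default parameters are assumptions made for this example.

```python
import random


def standardize_columns(genotypes):
    """Center and scale each SNP (column); monomorphic SNPs are dropped.

    `genotypes` is a list of rows (individuals) of allele counts (assumed 0/1/2).
    """
    n = len(genotypes)
    columns = []
    for snp in zip(*genotypes):
        mean = sum(snp) / n
        var = sum((g - mean) ** 2 for g in snp) / n
        if var > 0:
            sd = var ** 0.5
            columns.append([(g - mean) / sd for g in snp])
    return [list(row) for row in zip(*columns)]


def genetic_relationship_matrix(z):
    """Empirical genetic similarity: K = Z Z^T / (number of SNPs)."""
    n, m = len(z), len(z[0])
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = sum(a * b for a, b in zip(z[i], z[j])) / m
            k[i][j] = k[j][i] = value
    return k


def he_regression(phenotype, k):
    """Haseman-Elston estimate: regress y_i * y_j on K_ij over pairs i < j.

    The phenotype is standardized first; the slope (through the origin)
    is a moment-based heritability estimate and is not bounded to [0, 1].
    """
    n = len(phenotype)
    mean = sum(phenotype) / n
    sd = (sum((y - mean) ** 2 for y in phenotype) / n) ** 0.5
    y = [(v - mean) / sd for v in phenotype]
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            num += k[i][j] * y[i] * y[j]
            den += k[i][j] ** 2
    return num / den


def permutation_p_value(phenotype, k, n_perm=200, seed=0):
    """One-sided permutation p-value: shuffle phenotypes across individuals."""
    rng = random.Random(seed)
    observed = he_regression(phenotype, k)
    shuffled = list(phenotype)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if he_regression(shuffled, k) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy simulation (hypothetical sizes): 60 individuals, 40 SNPs.
    rng = random.Random(42)
    geno = [[rng.choice([0, 1, 2]) for _ in range(40)] for _ in range(60)]
    z = standardize_columns(geno)
    k = genetic_relationship_matrix(z)
    # Phenotype = additive genetic signal + independent noise.
    pheno = [sum(row) / len(row) ** 0.5 + rng.gauss(0, 1) for row in z]
    print("HE estimate:", he_regression(pheno, k))
    print("permutation p:", permutation_p_value(pheno, k))
```

Because the relationship matrix is computed once and reused, screening many phenotypes only repeats the cheap regression step. With samples this small the estimate is very noisy, so the toy run illustrates the workflow rather than realistic accuracy.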
