Powerful and Adaptive Testing for Multi-trait and Multi-SNP Associations with GWAS and Sequencing Data

Testing for genetic association with multiple traits has become increasingly important, not only because of its potential to boost statistical power, but also because of its direct relevance to applications. For example, there is accumulating evidence that some complex neurodegenerative and psychiatric diseases, such as Alzheimer’s disease, are due to disrupted brain networks; it is therefore natural to seek genetic variants associated with a disrupted brain network, represented as a set of multiple traits, one for each brain region of interest. In spite of its promise, testing for multivariate trait associations is challenging: if not used appropriately, its power can be much lower than that of testing each univariate trait separately (with proper control for multiple testing). Furthermore, unlike most existing methods for single-SNP–multiple-trait associations, we consider SNP set-based association testing to decipher complicated joint effects of multiple SNPs on multiple traits. Because the power of a test critically depends on several unknown factors, such as the proportions of associated SNPs and of associated traits, we propose a test that is highly adaptive at both the SNP and trait levels, giving higher weights to SNPs and traits that are likely to be associated, to yield high power across a wide spectrum of situations. We illuminate relationships among the proposed and some existing tests, showing that the proposed test covers several existing tests as special cases. We compare the performance of the new test with that of several existing tests, using both simulated and real data. The methods were applied to structural magnetic resonance imaging data drawn from the Alzheimer’s Disease Neuroimaging Initiative to identify genes associated with gray matter atrophy in the human brain default mode network (DMN).
For genome-wide association studies (GWAS), the new test found genes AMOTL1 on chromosome 11 and APOE on chromosome 19 to be significantly associated with the DMN. Notably, AMOTL1 was not detected by single SNP-based analyses. To our knowledge, AMOTL1 has not previously been highlighted in other Alzheimer’s disease studies, although it has been indicated to be related to cognitive impairment. The proposed method is also applicable to rare variants in sequencing data and can be extended to pathway analysis.
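
To make the idea of adapting at both the SNP and trait levels concrete, the following is a minimal illustrative sketch, not the test statistic proposed here (which the text above does not specify). It assumes a sum-of-powered-scores form, in which raising score-like statistics to a higher power gives more weight to the SNPs and traits with larger scores, and it assumes permutation p-values combined by taking the minimum over a small grid of powers. The names `powered_stat` and `adaptive_pvalue`, the power grid, and the permutation scheme are all hypothetical choices made for illustration.

```python
import numpy as np


def powered_stat(X, Y, g_snp, g_trait):
    """Combine SNP-by-trait scores with powers g_snp (over SNPs) and g_trait (over traits).

    X: (n_subjects, n_snps) genotype matrix; Y: (n_subjects, n_traits) trait matrix.
    Larger powers let the largest scores dominate the sum (an assumed weighting scheme).
    """
    U = X.T @ (Y - Y.mean(axis=0))  # (n_snps, n_traits) score-like matrix
    per_trait = (np.abs(U) ** g_snp).sum(axis=0) ** (1.0 / g_snp)
    return float((per_trait ** g_trait).sum())


def adaptive_pvalue(X, Y, powers=(1, 2, 4), n_perm=200, seed=0):
    """Permutation p-value for the minimum p-value over a grid of power pairs."""
    rng = np.random.default_rng(seed)
    grid = [(a, b) for a in powers for b in powers]
    # Row 0 holds the observed data; rows 1..n_perm hold trait rows permuted across subjects.
    stats = np.empty((n_perm + 1, len(grid)))
    for b in range(n_perm + 1):
        Yb = Y if b == 0 else Y[rng.permutation(len(Y))]
        stats[b] = [powered_stat(X, Yb, g1, g2) for g1, g2 in grid]
    # pvals[b, c]: fraction of rows whose statistic is >= that of row b, for grid cell c.
    pvals = (stats[None, :, :] >= stats[:, None, :]).mean(axis=1)
    min_p = pvals.min(axis=1)
    # Compare the observed minimum p-value with its permutation distribution.
    return float((min_p <= min_p[0]).mean())
```

Under these assumptions, calling `adaptive_pvalue(X, Y)` on a genotype matrix for one SNP set and a matrix of traits returns a single p-value for that set. Small powers behave like a burden-type sum that favors many weak signals, whereas large powers concentrate weight on the few strongest SNP-trait pairs; taking the minimum over the grid is one simple way to adapt between the two.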
