Novel Variance-Component TWAS method for studying complex human diseases with applications to Alzheimer’s dementia

Transcriptome-wide association studies (TWAS) have been widely used to integrate transcriptomic and genetic data to study complex human diseases. Within a test dataset lacking transcriptomic data, existing TWAS methods first impute gene expression as a weighted sum of SNP genotypes, with weights given by the SNPs' cis-eQTL effects on the reference transcriptome. They then use a linear regression model to assess the association between imputed gene expression and the test phenotype, thereby assuming that the effect of a cis-eQTL SNP on the test phenotype is a linear function of the eQTL's estimated effect on the reference transcriptome. To make TWAS more robust to this assumption, we propose a novel Variance-Component TWAS procedure (VC-TWAS) that treats the effects of cis-eQTL SNPs on the phenotype as random (with variance proportional to the corresponding reference cis-eQTL effects) rather than fixed. VC-TWAS is applicable to both continuous and dichotomous phenotypes, and to both individual-level and summary-level GWAS data. Using simulated data, we show that VC-TWAS is more powerful than traditional TWAS, especially when eQTL genetic effects on the test phenotype are no longer a linear function of their eQTL genetic effects on the reference transcriptome. We further applied VC-TWAS to both individual-level (N ≈ 3.4K) and summary-level (N ≈ 54K) GWAS data to study Alzheimer's dementia (AD). With the individual-level data, we detected 13 significant risk genes, including 6 known GWAS risk genes such as TOMM40 that were missed by existing TWAS methods. With the summary-level data, we detected 57 significant risk genes considering only cis-SNPs and 71 significant genes considering both cis- and trans-SNPs; these findings also validated our findings with the individual-level GWAS data. Our VC-TWAS method is implemented in the TIGAR tool for public use.

Existing transcriptome-wide association study (TWAS) tools make strong assumptions about the relationships among genetic variants, transcriptome, and phenotype. These assumptions may be violated in practice, substantially reducing power. Here, we propose a novel variance-component TWAS method (VC-TWAS) that relaxes these assumptions and can be implemented with both individual-level and summary-level GWAS data. Our simulation studies showed that VC-TWAS achieved higher power than existing TWAS methods when the assumptions required by existing TWAS tools were violated. We further applied VC-TWAS to both individual-level (N ≈ 3.4K) and summary-level (N ≈ 54K) GWAS data to study Alzheimer's dementia (AD). With individual-level data, we detected 13 significant risk genes, including 6 known GWAS risk genes such as TOMM40 that were missed by existing TWAS methods. Interestingly, 5 of these genes were shown to have significant pleiotropic effects on AD pathology phenotypes, revealing possible biological mechanisms. With summary-level data from a larger sample, we detected 57 significant risk genes considering only cis-SNPs and 71 significant genes considering both cis- and trans-SNPs, which also validated our findings with the individual-level GWAS data. In conclusion, VC-TWAS provides an important analytic tool for identifying risk genes whose effects on phenotypes might be mediated through transcriptomes.
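
The contrast between the two test types described above can be illustrated with a minimal sketch. This is not the TIGAR implementation. It assumes a continuous phenotype, no covariates, and a random-effect variance proportional to the squared cis-eQTL weight (one possible reading of "proportional to corresponding reference cis-eQTL effects"). The function names are hypothetical, and a permutation p-value stands in for an analytic null distribution. Under these assumptions the traditional (burden-type) statistic is the square of the weighted sum of per-SNP scores, while the variance-component statistic is the sum of the squared weighted scores, so SNP effects of opposite sign do not cancel.

```python
import numpy as np

def burden_score_stat(genotypes, weights, phenotype):
    """Burden-type TWAS score: (sum_j w_j * s_j)^2, with s_j = g_j' r."""
    resid = phenotype - phenotype.mean()      # residuals of intercept-only null model
    snp_scores = genotypes.T @ resid          # one score per SNP
    return float((weights @ snp_scores) ** 2)

def vc_score_stat(genotypes, weights, phenotype):
    """Variance-component score: sum_j (w_j * s_j)^2 = r' G diag(w^2) G' r."""
    resid = phenotype - phenotype.mean()
    snp_scores = genotypes.T @ resid
    return float(np.sum((weights * snp_scores) ** 2))

def permutation_pvalue(stat_fn, genotypes, weights, phenotype, n_perm=199, seed=0):
    """Empirical p-value from permuting the phenotype (assumes no covariates)."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(genotypes, weights, phenotype)
    exceed = sum(
        stat_fn(genotypes, weights, rng.permutation(phenotype)) >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)

# Toy data (hypothetical): 300 samples, 10 cis-SNPs, all eQTL weights positive,
# but phenotype effects alternate in sign, so they are not a linear function
# of the eQTL weights.
rng = np.random.default_rng(42)
n_samples, n_snps = 300, 10
G = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
eqtl_weights = np.full(n_snps, 0.5)
pheno_effects = 0.2 * np.array([1.0, -1.0] * (n_snps // 2))
y = G @ pheno_effects + rng.normal(size=n_samples)

print("burden p-value:", permutation_pvalue(burden_score_stat, G, eqtl_weights, y))
print("VC p-value:    ", permutation_pvalue(vc_score_stat, G, eqtl_weights, y))
```

With a single SNP the two statistics coincide; they diverge only when several SNPs contribute effects whose signs or relative sizes differ from those implied by the eQTL weights.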
