Integrating predicted transcriptome from multiple tissues improves association detection

Integration of genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is needed to improve our understanding of the biological mechanisms underlying GWAS hits, and our ability to identify therapeutic targets. Gene-level association test methods such as PrediXcan can prioritize candidate targets. However, limited eQTL sample sizes and the absence of relevant developmental and disease context restrict our ability to detect associations. Here we propose MulTiXcan, an efficient statistical method that leverages the substantial sharing of eQTLs across tissues and contexts to improve our ability to identify potential target genes. MulTiXcan integrates evidence across multiple panels while taking into account their correlation. We apply our method to a broad set of complex traits available from the UK Biobank and show that we can detect a larger set of significantly associated genes than when using each panel separately. To improve applicability, we developed an extension that works on summary statistics, S-MulTiXcan, which we show yields results highly concordant with the individual-level version. Results from our analysis, as well as the software and resources needed to apply our method, are publicly available.

Author summary

We develop a new method, MulTiXcan, to test the effect of gene expression regulation on complex traits, integrating information available across multiple tissue studies. We show this approach has higher power than traditional single-tissue methods. We extend this method to use only summary statistics from public GWAS. We apply these methods to over 200 complex traits available in the UK Biobank cohort and 100 complex traits from public GWAS, and discuss the findings.
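The abstract states that evidence is combined across panels while accounting for their correlation, but does not give the test itself. As a rough illustration only, one way to do this is to regress the trait jointly on a gene's predicted expression in all tissues, after rotating to principal components and dropping near-collinear directions, and then compute a joint F-statistic. The function name `multi_tissue_f_stat`, the `cond_threshold` parameter and its default are hypothetical choices for this sketch, not the published software's interface.

```python
import numpy as np


def multi_tissue_f_stat(pred_expr, trait, cond_threshold=30.0):
    """Joint F-statistic for one gene across several tissues (illustrative sketch).

    pred_expr: array (n_samples, n_tissues), predicted expression per tissue.
    trait:     array (n_samples,), quantitative trait values.
    cond_threshold: hypothetical cutoff; components whose eigenvalue is below
        max_eigenvalue / cond_threshold are discarded.
    Returns (F statistic, numerator df, denominator df).
    """
    X = np.asarray(pred_expr, dtype=float)
    y = np.asarray(trait, dtype=float)

    # Center so that no explicit intercept column is needed.
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n = X.shape[0]

    # Eigen-decomposition of the tissue-by-tissue covariance matrix.
    cov = X.T @ X / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Keep only well-conditioned directions; highly correlated tissues
    # collapse into fewer components.
    keep = eigvals > eigvals.max() / cond_threshold
    k = int(keep.sum())
    pcs = X @ eigvecs[:, keep]

    # Ordinary least squares of the trait on the retained components.
    beta, *_ = np.linalg.lstsq(pcs, y, rcond=None)
    rss_full = float(np.sum((y - pcs @ beta) ** 2))
    rss_null = float(np.sum(y ** 2))  # intercept-only model after centering

    df_num, df_den = k, n - k - 1
    f_stat = ((rss_null - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den
```

Under the null hypothesis of no association, the statistic would be compared against an F distribution with the returned degrees of freedom (for example with `scipy.stats.f.sf`). Discarding low-variance components is a design choice that keeps the regression stable when predicted expression is nearly identical across tissues.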
