Identification of therapeutic targets from genetic association studies using hierarchical component analysis

Background: Mapping disease-associated genetic variants to the pathophysiology of complex diseases is a major challenge in translating findings from genome-wide association studies (GWAS) into novel therapeutic opportunities. The difficulty lies in our limited understanding of how phenotypic traits arise from non-coding genetic variants in highly organized biological systems with heterogeneous gene expression across cells and tissues.

Results: We present a novel strategy, called GWAS component analysis, that transfers disease associations from single-nucleotide polymorphisms (SNPs) to co-expression modules by stacking models trained on reference genome and tissue-specific gene expression data. Applying this method to GWAS of blood cell counts confirmed that it could detect gene sets enriched in the expected cell types. In addition, coupling our method with Bayesian networks enables GWAS components to be used to discover drug targets.

Conclusions: We tested genome-wide associations of four disease phenotypes (age-related macular degeneration, Crohn’s disease, ulcerative colitis and rheumatoid arthritis) and demonstrated that the proposed method could select more functional genes than S-PrediXcan, a previous single-step model for predicting gene-level associations from SNP-level associations.
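The idea of stacking a SNP-to-gene model with a gene-to-module model can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes both layers are linear, so the two weight matrices compose into a single set of SNP weights per module, and it then applies the S-PrediXcan-style summary-statistic formula to those composite weights. The function name `component_z` and all argument names are hypothetical.

```python
import numpy as np

def component_z(snp_z, snp_cov, snp_to_gene, gene_to_module):
    """Module-level z-scores from SNP-level z-scores (illustrative only).

    snp_z          : (n_snps,) GWAS z-scores
    snp_cov        : (n_snps, n_snps) SNP covariance from a reference panel
    snp_to_gene    : (n_snps, n_genes) expression-prediction weights
    gene_to_module : (n_genes, n_modules) co-expression module loadings
    """
    snp_z = np.asarray(snp_z, dtype=float)
    snp_cov = np.asarray(snp_cov, dtype=float)
    # Assumption: both layers are linear, so stacking them is a matrix product.
    u = np.asarray(snp_to_gene, dtype=float) @ np.asarray(gene_to_module, dtype=float)
    # Per-SNP standard deviations from the diagonal of the covariance matrix.
    snp_sd = np.sqrt(np.diag(snp_cov))
    # Variance of each predicted module score: u_m' * Cov * u_m.
    comp_var = np.einsum("lm,lk,km->m", u, snp_cov, u)
    # z_m = sum_l u_lm * sd_l * z_l / sd_m
    return (u * snp_sd[:, None]).T @ snp_z / np.sqrt(comp_var)
```

Because the score is normalized by the predicted module variance, rescaling the module loadings by a positive constant leaves the z-scores unchanged. A nonlinear second layer would not reduce to a single matrix product in this way.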