Simultaneous inference of phenotype-associated genes and relevant tissues from GWAS data via Bayesian integration of multiple tissue-specific gene networks

Although genome-wide association studies (GWAS) have identified thousands of genomic loci associated with hundreds of complex traits over the past decade, unresolved problems such as missing heritability and weak interpretability call for effective computational methods to support more advanced analysis of the large volume of existing and anticipated genetic data. Toward this goal, gene-level integrative GWAS analysis, which assumes that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks, has recently attracted much attention because of advantages such as straightforward interpretation, a lower multiple-testing burden, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus cannot make use of informative tissue-specific characteristics. To overcome this limitation, we propose a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) that integrates GWAS data with multiple tissue-specific gene networks to infer phenotype-associated genes and relevant tissues simultaneously. Through extensive simulation studies, we show that our method is effective in finding both the associated genes and the relevant tissues for a phenotype. In applications to real GWAS data for 14 complex phenotypes, we demonstrate the power of our method both in deciphering the genetic basis of a phenotype and in yielding biological insights into it. We expect SIGNET to be a valuable tool for integrative GWAS analysis, supporting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.
