A Method for Gene-Based Pathway Analysis Using Genomewide Association Study Summary Statistics Reveals Nine New Type 1 Diabetes Associations

Pathway analysis can complement point‐wise single nucleotide polymorphism (SNP) analysis in exploring genomewide association study (GWAS) data to identify specific disease‐associated genes that can be candidate causal genes. We propose a straightforward methodology for conducting a gene‐based pathway analysis using summary GWAS statistics in combination with widely available reference genotype data. We used this method to perform a gene‐based pathway analysis of a type 1 diabetes (T1D) meta‐analysis GWAS (7,514 cases and 9,045 controls). An important feature of the analysis is the removal of the major histocompatibility complex gene region, the major genetic risk factor for T1D. Thirty‐one of the 1,583 tested pathways (2%) were enriched for association with T1D at a 5% false discovery rate. We analyzed these 31 pathways and their genes to identify SNPs in or near the pathway genes that showed potentially novel association with T1D, and attempted to replicate the association of 22 SNPs in additional samples. Replication P‐values were skewed (P = 9.85 × 10^-11), with 12 of the 22 SNPs showing P < 0.05. Support, including replication evidence, was obtained for nine T1D‐associated variants in the genes ITGB7 (rs11170466, P = 7.86 × 10^-9), NRP1 (rs722988, P = 4.88 × 10^-8), BAD (rs694739, P = 2.37 × 10^-7), CTSB (rs1296023, P = 2.79 × 10^-7), FYN (rs11964650, P = 5.60 × 10^-7), UBE2G1 (rs9906760, P = 5.08 × 10^-7), MAP3K14 (rs17759555, P = 9.67 × 10^-7), ITGB1 (rs1557150, P = 1.93 × 10^-6), and IL7R (rs1445898, P = 2.76 × 10^-6). The proposed methodology can be applied to other GWAS datasets for which only summary‐level data are available.
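The abstract reports pathways selected at a 5% false discovery rate but does not say how the threshold was applied. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common way to control the false discovery rate across many pathway P-values. The function name `benjamini_hochberg` and the example P-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level alpha.

    Assumption: this is a generic illustration, not the study's own code.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis whose rank is at or below the cutoff,
    # even if it missed its own per-rank threshold (the step-up property).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical pathway P-values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(example_p, alpha=0.05)
print(sum(selected), "of", len(example_p), "pathways selected")
```

In a setting like the one described, `p_values` would hold one enrichment P-value per tested pathway (1,583 in the abstract), and the entries flagged `True` would be the pathways carried forward for SNP-level follow-up.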
