Explaining the disease phenotype of intergenic SNP through predicted long range regulation

Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGRs), which makes their association with disease phenotypes difficult to interpret. Recent analyses have found that non-coding daSNPs are frequently located in or near regulatory elements, which motivated us to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. After locating the distal regulatory element (DRE) nearest to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by that DRE, and then investigated their functional relevance to the disease phenotypes of the IGR daSNP. Of all IGR daSNP-disease phenotype associations investigated, 36.8% were potentially explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion increased to 60.5% when SNPs in linkage disequilibrium (LD) with the daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes.
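
The first step described above, locating the DRE nearest to a given IGR daSNP, can be illustrated with a minimal sketch. This is not INTREPID and not the authors' implementation. The function name `nearest_dre`, the coordinate values, and the assumptions that all DREs lie on one chromosome, are sorted by start position, and do not overlap are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def interval_distance(pos, interval):
    """Distance in bp from a position to a closed interval (0 if inside)."""
    start, end = interval
    if start <= pos <= end:
        return 0
    return min(abs(pos - start), abs(pos - end))


def nearest_dre(snp_pos, dres):
    """Return the DRE interval closest to snp_pos, or None if dres is empty.

    Assumptions (hypothetical, for illustration only):
      - dres is a list of (start, end) tuples on the same chromosome
      - dres is sorted by start and the intervals do not overlap
    """
    if not dres:
        return None
    starts = [start for start, _ in dres]
    # Index of the first DRE that starts strictly after the SNP.
    i = bisect_right(starts, snp_pos)
    # Only the DRE just before and the DRE just after can be the nearest.
    candidates = dres[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda d: interval_distance(snp_pos, d))


# Made-up coordinates, not real genomic data.
example_dres = [(100, 200), (500, 600), (1000, 1100)]
print(nearest_dre(250, example_dres))
```

In a real analysis the lookup would be done per chromosome, and the returned DRE would then be passed to a target-gene prediction method; that step is outside the scope of this sketch.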
