Integrating enhancer RNA signatures with diverse omics data identifies characteristics of transcription initiation in pancreatic islets

Identifying active regulatory elements and their molecular signatures is critical to understanding gene regulatory mechanisms and, in turn, to better delineating the biological mechanisms of complex diseases and traits. Studies have shown that active enhancers can be transcribed into enhancer RNA (eRNA). Here, we identify actively transcribed regulatory elements in human pancreatic islets by generating eRNA profiles using cap analysis of gene expression (CAGE) across 70 islet samples. We identify 9,954 clusters of CAGE tag transcription start sites (TSSs), or tag clusters (TCs), in islets, ~20% of which are islet-specific when compared to CAGE TCs across publicly available tissues. Islet TCs are most strongly enriched for overlap with genome-wide association study (GWAS) loci for islet-relevant traits such as fasting glucose. We integrated islet CAGE profiles with diverse epigenomic information, such as chromatin immunoprecipitation followed by sequencing (ChIP-seq) profiles of five histone modifications and accessible chromatin profiles from the assay for transposase-accessible chromatin followed by sequencing (ATAC-seq), to understand how the underlying islet chromatin landscape is associated with TSSs. We find that ATAC-seq-informed transcription factor (TF) binding sites (TF footprint motifs) for the RFX TF family are highly enriched in transcribed regions occurring in enhancer chromatin states, whereas TF footprint motifs for the ETS family are highly enriched in transcribed regions within promoter chromatin states. Using massively parallel reporter assays in a rat pancreatic islet beta cell line, we tested the activity of 3,378 islet CAGE elements and found that 2,279 (~67.5%) show significant regulatory activity (5% FDR). We find that TCs within accessible enhancers show higher enrichment for overlap with type 2 diabetes (T2D) GWAS loci than accessible enhancer annotations alone, suggesting that TC annotations pinpoint active regions within the enhancer chromatin states. This work provides a high-resolution transcriptional regulatory map of human pancreatic islets.
