A Transcription Start Site Map in Human Pancreatic Islets Reveals Functional Regulatory Signatures

Identifying the tissue-specific molecular signatures of active regulatory elements is critical to understanding gene regulatory mechanisms. Here, we identify transcription start sites (TSSs) using cap analysis of gene expression (CAGE) across 57 human pancreatic islet samples. We identify 9,954 reproducible CAGE tag clusters (TCs), ∼20% of which are islet specific and occur mostly distal to known gene TSSs. We integrate islet CAGE data with histone modification and chromatin accessibility profiles to identify epigenomic signatures of transcription initiation. Using a massively parallel reporter assay, we validate transcriptional enhancer activity for 2,279 of 3,378 (∼68%) tested islet CAGE elements (5% false discovery rate). TCs within accessible enhancers are more strongly enriched for overlap with type 2 diabetes genome-wide association study (GWAS) signals than existing islet annotations, which underscores the utility of mapping CAGE profiles in disease-relevant tissue. This work provides a high-resolution map of transcription initiation in human pancreatic islets that can be used to dissect active enhancers at GWAS loci.
