Clustered ChIP-Seq-defined transcription factor binding sites and histone modifications map distinct classes of regulatory elements

Background
Transcription factor binding to DNA requires both an appropriate binding element and suitably open chromatin, which together help to define regulatory elements within the genome. Current methods of identifying regulatory elements, such as promoters or enhancers, typically rely on sequence conservation, existing gene annotations or specific marks, such as histone modifications and p300 binding, each of which has its own biases.

Results
Herein we show that an approach based on clustering of transcription factor peaks from chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-Seq) can be used to evaluate markers for regulatory elements. We used 67 data sets for 54 unique transcription factors distributed over two cell lines to create regulatory element clusters. By integrating the clusters from our approach with histone modifications and data for open chromatin, we identified general methylation of lysine 4 on histone H3 (H3K4me) as the most specific marker for transcription factor clusters. Clusters mapping to annotated genes showed distinct patterns in cluster composition related to gene expression and histone modifications. Clusters mapping to intergenic regions fall into two groups: those directly involved in transcription, including miRNAs and long noncoding RNAs, and those facilitating transcription by long-range interactions. The latter clusters were specifically enriched with H3K4me1, but less so with acetylation of lysine 27 on histone H3 (H3K27ac) or p300 binding.

Conclusion
By integrating genome-wide data on transcription factor binding and chromatin structure and using our data-driven approach, we pinpointed the chromatin marks that best explain transcription factor association with different regulatory elements. Our results also indicate that a modest selection of transcription factors may be sufficient to map most regulatory elements in the human genome.
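The abstract does not specify how transcription factor peaks were grouped into clusters, so the following is only a minimal sketch of one common way to do it: peaks on the same chromosome are merged into a cluster when the gap between them is at most a chosen distance. The `Peak` type, the `cluster_peaks` function and the `max_gap` value of 100 bp are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict
from typing import NamedTuple


class Peak(NamedTuple):
    """A ChIP-Seq peak for one transcription factor (hypothetical record type)."""
    chrom: str
    start: int
    end: int
    factor: str


def cluster_peaks(peaks, max_gap=100):
    """Merge peaks on the same chromosome separated by at most max_gap bases.

    max_gap is an illustrative assumption, not a value from the paper.
    Returns a list of dicts with the cluster coordinates and the set of
    transcription factors contributing peaks to the cluster.
    """
    by_chrom = defaultdict(list)
    for peak in peaks:
        by_chrom[peak.chrom].append(peak)

    clusters = []
    for chrom in sorted(by_chrom):
        current = None
        # Sweep peaks from left to right, extending the open cluster
        # while the next peak starts close enough to its right edge.
        for peak in sorted(by_chrom[chrom], key=lambda p: (p.start, p.end)):
            if current is not None and peak.start - current["end"] <= max_gap:
                current["end"] = max(current["end"], peak.end)
                current["factors"].add(peak.factor)
            else:
                if current is not None:
                    clusters.append(current)
                current = {
                    "chrom": chrom,
                    "start": peak.start,
                    "end": peak.end,
                    "factors": {peak.factor},
                }
        if current is not None:
            clusters.append(current)
    return clusters


# Toy input with made-up coordinates and factor names.
example_peaks = [
    Peak("chr1", 100, 200, "TF_A"),
    Peak("chr1", 250, 300, "TF_B"),
    Peak("chr1", 1000, 1100, "TF_A"),
    Peak("chr2", 50, 80, "TF_C"),
]
example_clusters = cluster_peaks(example_peaks, max_gap=100)
```

Each resulting cluster records which factors it contains, so cluster composition could then be compared against histone modification or open chromatin tracks covering the same interval.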
