Probabilistic modelling of chromatin code landscape reveals functional diversity of enhancer-like chromatin states

Interpreting the functional state of chromatin from the combinatorial binding patterns of chromatin factors, that is, the chromatin codes, is crucial for decoding the epigenetic state of the cell. Here we present a systematic map of Drosophila chromatin states derived from data-driven probabilistic modelling of dependencies between chromatin factors. Our model not only recapitulates the enhancer-like chromatin states indicated by widely used enhancer marks but also divides these states into three functionally distinct groups, of which only one exhibits active enhancer activity. Moreover, we discover a strong association between one specific enhancer state and RNA Polymerase II pausing, linking transcription regulatory potential and chromatin organization. We also observe that, with the exception of long-intron genes, chromatin state transition positions in transcriptionally active genes lie at a fixed absolute distance from the corresponding transcription start site, regardless of gene length. Using our method, we provide a resource that helps elucidate the functional and spatial organization of the chromatin code landscape.
