Predicting Spatial and Temporal Gene Expression Using an Integrative Model of Transcription Factor Occupancy and Chromatin State

Precise patterns of spatial and temporal gene expression are central to metazoan complexity and act as a driving force for embryonic development. While there has been substantial progress in dissecting and predicting cis-regulatory activity, our understanding of how information from multiple enhancer elements converges to regulate a gene's expression remains limited. This is in large part due to the number of different biological processes involved in mediating regulation, as well as the limited availability of experimental measurements for many of them. Here, we used a Bayesian approach to model diverse experimental regulatory data, leading to accurate predictions of both spatial and temporal aspects of gene expression. We integrated whole-embryo information on transcription factor recruitment to multiple cis-regulatory modules, insulator binding, and histone modification status in the vicinity of individual gene loci, at a genome-wide scale during Drosophila development. The model uses Bayesian networks to represent the relation between transcription factor occupancy and enhancer activity in specific tissues and stages. All parameters are optimized in an Expectation Maximization procedure, yielding a model capable of predicting the tissue- and stage-specific activity of new, previously unassayed genes. Performing the optimization with subsets of the input data demonstrated that neither enhancer occupancy nor chromatin state alone can explain all gene expression patterns, but that taken together they allow for accurate predictions of spatio-temporal activity. Model predictions were validated using the expression patterns of more than 600 genes recently made available by the BDGP consortium, demonstrating an average 15-fold enrichment of genes expressed in the predicted tissue over a naïve model.
We further validated the model by experimentally testing the expression of 20 predicted target genes of unknown expression, resulting in an accuracy of 95% for temporal predictions and 50% for spatial predictions. While this is, to our knowledge, the first genome-wide approach to predict tissue-specific gene expression in metazoan development, our results suggest that integrative models of this type will become more prevalent in the future.
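
The fold-enrichment figure quoted above can be illustrated with a minimal sketch. The abstract does not spell out how enrichment was computed, so this assumes the common definition: the fraction of predicted genes that are truly expressed in a tissue, divided by the fraction expected from a naïve model that picks genes at the background rate. The function name `fold_enrichment` and the gene identifiers are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set


def fold_enrichment(predicted: Iterable[str],
                    expressed: Set[str],
                    background: Iterable[str]) -> float:
    """Ratio of the hit rate among predicted genes to the background rate.

    Assumption: the naive model selects genes at random from `background`,
    so its expected hit rate equals the fraction of background genes that
    are expressed in the tissue.
    """
    predicted_set = set(predicted)
    background_set = set(background)
    if not predicted_set or not background_set:
        raise ValueError("predicted and background must be non-empty")

    hit_rate = len(predicted_set & expressed) / len(predicted_set)
    base_rate = len(background_set & expressed) / len(background_set)
    if base_rate == 0:
        raise ValueError("no background gene is expressed in this tissue")
    return hit_rate / base_rate


# Toy data (hypothetical): 20 assayed genes, 2 of them expressed in the tissue.
background_genes = [f"g{i}" for i in range(20)]
expressed_in_tissue = {"g0", "g1"}
# The model predicts 4 genes for this tissue; 2 of them are correct.
predicted_genes = ["g0", "g1", "g5", "g6"]

enrichment = fold_enrichment(predicted_genes, expressed_in_tissue, background_genes)
print(f"fold enrichment: {enrichment:.1f}")
```

In this toy case the hit rate is 2/4 and the background rate is 2/20, giving a five-fold enrichment. An average over tissues, as reported in the abstract, would repeat this calculation per tissue and take the mean.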
