Motif-blind, genome-wide discovery of cis-regulatory modules in Drosophila and mouse.

We present new approaches to cis-regulatory module (CRM) discovery in the common scenario where relevant transcription factors and/or motifs are unknown. Beginning with a small list of CRMs mediating a common gene expression pattern, we search genome-wide for CRMs with similar functionality, using new statistical scores and without requiring known motifs or accurate motif discovery. We cross-validate our predictions on 31 regulatory networks in Drosophila and through correlations with gene expression data. Five predicted modules tested using an in vivo reporter gene assay all show tissue-specific regulatory activity. We also demonstrate our methods' ability to predict mammalian tissue-specific enhancers. Finally, we predict human CRMs that regulate early blood and cardiovascular development. In vivo transgenic mouse analysis of two predicted CRMs demonstrates that both have appropriate enhancer activity. Overall, 7/7 predictions were validated successfully in vivo, demonstrating the effectiveness of our approach for insect and mammalian genomes.
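The abstract does not specify the statistical scores used, so the following is only a minimal sketch of the general motif-blind idea: pool the k-mer counts of a few training CRMs, then rank sliding genomic windows by how closely their k-mer profile matches the pooled one. The cosine similarity measure, the function names (`kmer_profile`, `cosine`, `scan`), and the parameters `k`, `window`, and `step` are all illustrative assumptions, not the authors' method.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Count overlapping k-mers in seq, skipping any k-mer with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two k-mer count profiles (0.0 if either is empty)."""
    dot = sum(count * b[kmer] for kmer, count in a.items() if kmer in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def scan(genome, training_crms, window=20, step=10, k=3):
    """Rank sliding windows by similarity to the pooled training profile.

    Assumption: a single similarity score stands in for the paper's
    (unspecified here) statistical scores. No motifs are supplied.
    """
    pooled = Counter()
    for crm in training_crms:
        pooled.update(kmer_profile(crm, k))
    hits = []
    for start in range(0, max(len(genome) - window, 0) + 1, step):
        profile = kmer_profile(genome[start:start + window], k)
        hits.append((start, cosine(pooled, profile)))
    # Highest-scoring windows first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data (hypothetical): a GATA-rich stretch embedded in low-complexity flanks.
toy_genome = "A" * 40 + "GATA" * 5 + "C" * 40
toy_training = ["GATAGATAGATA", "TAGATAGATAGA"]
ranked = scan(toy_genome, toy_training)
```

In this toy setting `ranked` lists `(start, score)` pairs from best to worst. A realistic application would need a background model, much longer windows, and held-out CRMs for cross-validation, as the abstract describes.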
