Anecdotes, data and regulatory modules

Beginning in the late 1980s, Eric Davidson's group at Caltech developed a modularity hypothesis of developmental gene regulation, showing in an expanding number of cases that particular aspects of development are governed by compact ‘modules’ of transcription factor binding sites (TFBSs), and that these modules are separable, complex and interconnected. Davidson made no attempt to generalize the hypothesis further, but others took up the idea, carried it beyond development and extended it into a general rule of clustering. Despite such misbegotten origins, the ‘extended’ modularity hypothesis, that TFBSs in general tend to occur in compact clusters, has been highly productive; yet it has never been tested against a large, diverse and unbiased dataset to see how universal it actually is. The aim of the present paper is to do so. Applying human–mouse–rat phylogenetic footprinting to the neighbourhoods of a diverse set of TFBSs, including both developmental and non-developmental signals, we find that the extended hypothesis holds in at least 93.5% of cases. In this sample, the mean module length is 609 nucleotides, and a module contains on average 24.5 presumptive regulatory signals of length greater than 5 nucleotides, averaging 8.5 nucleotides each.
