The identification and functional characterisation of conserved regulatory elements in developmental genes.

Understanding the mechanisms that govern the expression of genomes is one of the major challenges of the post-genomic era. Phylogenetic footprinting, which identifies genomic regions under evolutionary constraint, has proven helpful in finding cis-regulatory elements of transcription; however, this method may not be applicable across all evolutionary distances or to all types of genes. Recent results from vertebrate comparisons indicate that strong conservation of cis-regulatory regions may occur more frequently in developmental regulator genes. This paper reviews methods of identifying conserved regulatory elements of developmental genes by comparative genomics, including new attempts to detect conserved features beyond simple sequence similarity. The results obtained are outlined and the authors comment on their functional and evolutionary implications. Finally, currently available methods of characterising the function of presumed conserved regulatory regions are evaluated, and problems such as promoter compatibility, the assignment of distant elements to their cognate genes and the multifunctionality of elements are discussed.

[74]  A. Childs,et al.  Conserved elements in Pax6 intron 7 involved in (auto)regulation and alternative transcription. , 2004, Developmental biology.

[75]  A. Clark,et al.  Evolution of transcription factor binding sites in Mammalian gene regulatory regions: conservation and turnover. , 2002, Molecular biology and evolution.

[76]  David Baltimore,et al.  One Nucleotide in a κB Site Can Determine Cofactor Specificity for NF-κB Dimers , 2004, Cell.

[77]  E. Davidson,et al.  Genomic cis-regulatory logic: experimental and computational analysis of a sea urchin gene. , 1998, Science.

[78]  F. Müller,et al.  Search for enhancers: teleost models in comparative genomic and transgenic analysis of cis regulatory elements. , 2002, BioEssays : news and reviews in molecular, cellular and developmental biology.

[79]  A. Hoffmann,et al.  One nucleotide in a kappaB site can determine cofactor specificity for NF-kappaB dimers. , 2004, Cell.

[80]  H. Kondoh,et al.  Functional analysis of chicken Sox2 enhancers highlights an array of diverse regulatory elements that are conserved in mammals. , 2003, Developmental cell.

[81]  G. Stormo,et al.  Identification of a novel cis-regulatory element involved in the heat shock response in Caenorhabditis elegans using microarray gene expression and computational methods. , 2002, Genome research.

[82]  H. Kondoh,et al.  Efficient identification of regulatory sequences in the chicken genome by a powerful combination of embryo electroporation and genome comparison , 2004, Mechanisms of Development.

[83]  E. Davidson,et al.  Quantitative imaging of cis-regulatory reporters in living embryos , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[84]  Berthold Göttgens,et al.  Analysis of vertebrate SCL loci identifies conserved enhancers , 2000, Nature Biotechnology.

[85]  E. Davidson,et al.  Cis-regulatory logic in the endo16 gene: switching from a specification to a differentiation mode of control. , 2001, Development.

[86]  Timothy B. Stockwell,et al.  The Sequence of the Human Genome , 2001, Science.

[87]  M. Scott,et al.  Automated sorting of live transgenic embryos , 2001, Nature Biotechnology.

[88]  I. Ovcharenko,et al.  eShadow: a tool for comparing closely related sequences. , 2004, Genome research.

[89]  Edwin Cuppen,et al.  Efficient target-selected mutagenesis in zebrafish. , 2003, Genome research.

[90]  L. Pachter,et al.  Strategies and tools for whole-genome alignments. , 2002, Genome research.

[91]  E. Birney,et al.  Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. , 2004, Genome research.

[92]  Matthew W. Hahn,et al.  The evolution of transcriptional regulation in eukaryotes. , 2003, Molecular biology and evolution.

[93]  S. Ogbourne,et al.  Transcriptional control and the role of silencers in transcriptional regulation in eukaryotes. , 1998, The Biochemical journal.

[94]  Peter W. Markstein,et al.  Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[95]  M. Ekker,et al.  A Highly Conserved Enhancer in the Dlx5/Dlx6Intergenic Region is the Site of Cross-Regulatory Interactions betweenDlx Genes in the Embryonic Forebrain , 2000, The Journal of Neuroscience.

[96]  Michael Litt,et al.  The insulation of genes from external enhancers and silencing chromatin , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[97]  Y. Hérault,et al.  Evolutionary conserved sequences are required for the insulation of the vertebrate Hoxd complex in neural cells , 2002, Development.

[98]  Holger Karas,et al.  TRANSFAC: a database on transcription factors and their DNA binding sites , 1996, Nucleic Acids Res..

[99]  D. Duboule,et al.  Mouse limb deformity mutations disrupt a global control region within the large regulatory landscape required for Gremlin expression. , 2004, Genes & development.

[100]  Saurabh Sinha,et al.  Cross-species comparison significantly improves genome-wide prediction of cis-regulatory modules in Drosophila , 2004, BMC Bioinformatics.

[101]  Klaudia Walter,et al.  Highly Conserved Non-Coding Sequences Are Associated with Vertebrate Development , 2004, PLoS biology.

[102]  B. Aronow,et al.  Genomic sequence comparison of the human and mouse adenosine deaminase gene regions , 1999, Mammalian Genome.

[103]  Marc S. Halfon,et al.  Prediction of similarly-acting cis-regulatory modules by subsequence profiling and comparative genomics in D . melanogaster and D . pseudoobscura , 2004 .

[104]  F. Müller,et al.  The multicoloured world of promoter recognition complexes , 2004, The EMBO journal.

[105]  Wyeth W. Wasserman,et al.  JASPAR: an open-access database for eukaryotic transcription factor binding profiles , 2004, Nucleic Acids Res..

[106]  W. Gehring,et al.  Homology of the eyeless gene of Drosophila to the Small eye gene in mice and Aniridia in humans. , 1994, Science.

[107]  Lior Pachter,et al.  MAVID: constrained ancestral alignment of multiple sequences. , 2003, Genome research.

[108]  R. Sorek,et al.  Intronic sequences flanking alternatively spliced exons are conserved between human and mouse. , 2003, Genome research.

[109]  J. T. Kadonaga,et al.  The RNA polymerase II core promoter: a key component in the regulation of gene expression. , 2002, Genes & development.

[110]  S. Carroll,et al.  Molecular mechanisms of selector gene function and evolution. , 2002, Current opinion in genetics & development.

[111]  James T Kadonaga,et al.  The DPE, a core promoter element for transcription by RNA polymerase II , 2002, Experimental & Molecular Medicine.

[112]  C. Fizames,et al.  Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence , 2000, Nature Genetics.

[113]  M. Nóbrega,et al.  Comparative genomics at the vertebrate extremes , 2004, Nature Reviews Genetics.

[114]  A. Wolffe,et al.  Chromatin and transcriptional activity in early Xenopus development. , 1995, Seminars in cell biology.

[115]  Arend Sidow,et al.  Genomic regulatory regions: insights from comparative sequence analysis. , 2003, Current opinion in genetics & development.

[116]  M. Goodman,et al.  Embryonic ε and γ globin genes of a prosimian primate (Galago crassicaudatus): Nucleotide and amino acid sequences, developmental regulation and phylogenetic footprints , 1988 .

[117]  F. Grosveld,et al.  Modification of human beta-globin locus PAC clones by homologous recombination in Escherichia coli. , 2000, Nucleic acids research.

[118]  R. Gibbs,et al.  PipMaker--a web server for aligning two genomic DNA sequences. , 2000, Genome research.

[119]  D. Haussler,et al.  Ultraconserved Elements in the Human Genome , 2004, Science.

[120]  S. Brenner,et al.  Fugu and human sequence comparison identifies novel human genes and conserved non-coding sequences. , 2002, Gene.

[121]  Alexandre Reymond,et al.  Evolutionary Discrimination of Mammalian Conserved Non-Genic Sequences (CNGs) , 2003, Science.

[122]  M. Schartl,et al.  Medaka — a model organism from the far east , 2002, Nature Reviews Genetics.

[123]  Lior Pachter,et al.  VISTA: computational tools for comparative genomics , 2004, Nucleic Acids Res..

[124]  Cameron S. Osborne,et al.  Active genes dynamically colocalize to shared sites of ongoing transcription , 2004, Nature Genetics.

[125]  Nancy F. Hansen,et al.  Comparative analyses of multi-species sequences from targeted genomic regions , 2003, Nature.

[126]  Marc S Halfon,et al.  Exploring genetic regulatory networks in metazoan development: methods and models. , 2002, Physiological genomics.

[127]  C. Lawrence,et al.  Human-mouse genome comparisons to locate regulatory sites , 2000, Nature Genetics.

[128]  F. Müller,et al.  A floor plate enhancer of the zebrafish netrin1 gene requires Cyclops (Nodal) signalling and the winged helix transcription factor FoxA2. , 2002, Developmental biology.

[129]  M. Busslinger,et al.  The activation and maintenance of Pax2 expression at the mid-hindbrain boundary is controlled by separate enhancers. , 2002, Development.

[130]  Kenta Nakai,et al.  BTSS, DataBase of Transcriptional Start Sites: progress report 2004 , 2004, Nucleic Acids Res..

[131]  J. Fak,et al.  Transcriptional Control in the Segmentation Gene Network of Drosophila , 2004, PLoS biology.

[132]  S. P. Fodor,et al.  Evolutionarily conserved sequences on human chromosome 21. , 2001, Genome research.

[133]  Lior Pachter,et al.  VISTA : visualizing global DNA sequence alignments of arbitrary length , 2000, Bioinform..

[134]  Michael Levine,et al.  Whole-Genome Analysis of Dorsal-Ventral Patterning in the Drosophila Embryo , 2002, Cell.

[135]  H. Bernard,et al.  Nuclear Matrix Attachment Regions of Human Papillomavirus Type 16 Point toward Conservation of These Genomic Elements in All Genital Papillomaviruses , 1998, Journal of Virology.

[136]  C. Burge,et al.  Vertebrate MicroRNA Genes , 2003, Science.

[137]  J. Wittbrodt,et al.  Medaka and zebrafish, an evolutionary twin study , 2004, Mechanisms of Development.

[138]  O. Hobert,et al.  Genomic cis-regulatory architecture and trans-acting regulators of a single interneuron-specific gene battery in C. elegans. , 2004, Developmental cell.

[139]  Vincent Bertrand,et al.  Neural Tissue in Ascidian Embryos Is Induced by FGF9/16/20, Acting via a Combination of Maternal GATA and Ets Transcription Factors , 2003, Cell.

[140]  Jon D. McAuliffe,et al.  Phylogenetic Shadowing of Primate Sequences to Find Functional Regions of the Human Genome , 2003, Science.

[141]  Massimo Vergassola,et al.  Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo , 2002, BMC Bioinformatics.

[142]  Michael Levine,et al.  Coordinate enhancers share common organizational features in the Drosophila genome. , 2004, Proceedings of the National Academy of Sciences of the United States of America.

[143]  M. Levine,et al.  Immunity regulatory DNAs share common organizational features in Drosophila. , 2004, Molecular cell.

[144]  Stephen M. Mount,et al.  The genome sequence of Drosophila melanogaster. , 2000, Science.

[145]  Jens Stoye,et al.  Benchmarking tools for the alignment of functional noncoding DNA , 2004, BMC Bioinformatics.

[146]  Paramvir S. Dehal,et al.  Whole-Genome Shotgun Assembly and Analysis of the Genome of Fugu rubripes , 2002, Science.

[147]  D. Tautz Evolution of transcriptional regulation. , 2000, Current opinion in genetics & development.

[148]  Anjana Rao,et al.  Bioinformatics for the 'bench biologist': how to find regulatory regions in genomic DNA , 2004, Nature Immunology.

[149]  A. Sandelin,et al.  Identification of conserved regulatory elements by comparative genome analysis , 2003, Journal of biology.

[150]  Francesca Chiaromonte,et al.  Regulatory potential scores from genome-wide three-way alignments of human, mouse, and rat. , 2004, Genome research.