Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis.

Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements that are likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in the transcriptional regulation of development. Although most of these elements share little sequence identity with one another, a small number are remarkably similar and appear to be the product of duplication events. Here, we searched for duplicated conserved noncoding elements in the human genome, using comparisons with Fugu to select putative cis-regulatory sequences. We identified 124 families of duplicated elements, each containing between two and five members, that are highly conserved within and between vertebrate genomes. For 74% of families, we were able to assign a specific set of paralogous genes with annotation relating to transcriptional regulation and/or development, thus removing much of the ambiguity in identifying associated genes. We find that duplicate elements have the potential to up-regulate reporter gene expression in a tissue-specific manner and that expression domains often overlap, but are not necessarily identical, between family members. More than two-thirds of the families are conserved in duplicate in fish and appear to predate the large-scale duplication events thought to have occurred at the origin of vertebrates. We propose a model whereby gene duplication and the evolution of cis-regulatory elements can be considered in the context of increased morphological diversity and the emergence of the modern vertebrate body plan.
