Thousands of human mobile element fragments undergo strong purifying selection near developmental genes

At least 5% of the human genome predating the mammalian radiation is thought to have evolved under purifying selection, yet protein-coding and related untranslated exons occupy at most 2% of the genome. Thus, the majority of conserved and, by extension, functional sequence in the human genome seems to be nonexonic. Recent work has highlighted a handful of cases where mobile element insertions have resulted in the introduction of novel conserved nonexonic elements. Here, we present a genome-wide survey of 10,402 constrained nonexonic elements in the human genome that have all been deposited by characterized mobile elements. These repeat instances have been under strong purifying selection since at least the boreoeutherian ancestor (100 Mya). They are most often located in gene deserts and show a strong preference for residing closest to genes involved in development and transcription regulation. In particular, constrained nonexonic elements with clear repetitive origins are located near genes involved in cell adhesion, including all characterized cellular members of the reelin-signaling pathway. Overall, we find that mobile elements have contributed at least 5.5% of all constrained nonexonic elements unique to mammals, suggesting that mobile elements may have played a larger role than previously recognized in shaping and specializing the landscape of gene regulation during mammalian evolution.
