The evolution of lncRNA repertoires and expression patterns in tetrapods

Only a very small fraction of long noncoding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but the absence of lncRNA annotations in non-model organisms has precluded comparative analyses. Here we present a large-scale evolutionary study of lncRNA repertoires and expression patterns in 11 tetrapod species. We identify approximately 11,000 primate-specific lncRNAs and 2,500 highly conserved lncRNAs, including approximately 400 genes that are likely to have originated more than 300 million years ago. We find that lncRNAs, in particular ancient ones, are in general actively regulated and may function predominantly in embryonic development. Most lncRNAs evolve rapidly in terms of sequence and expression levels, but tissue specificities are often conserved. We compared expression patterns of homologous lncRNA and protein-coding families across tetrapods to reconstruct an evolutionarily conserved co-expression network. This network suggests potential functions for lncRNAs in fundamental processes such as spermatogenesis and synaptic transmission, but also in more specific mechanisms such as placenta development through microRNA production.
