Identification of novel non-coding RNAs using profiles of short sequence reads from next generation sequencing data

Background: The increasing interest in small non-coding RNAs (ncRNAs) such as microRNAs (miRNAs), small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs), together with recent advances in sequencing technology, has yielded large numbers of short (18-32 nt) RNA sequences from different organisms, some of which are derived from small nucleolar RNAs (snoRNAs) and transfer RNAs (tRNAs). We observed that these short ncRNAs frequently cover the entire length of annotated snoRNAs or tRNAs, which suggests that other loci specifying similar ncRNAs can be identified by clusters of short RNA sequences.

Results: We combined publicly available datasets of tens of millions of short RNA sequence tags from Drosophila melanogaster and mapped them to the Drosophila genome. Approximately 6 million perfectly mapping sequence tags were then assembled into 521,302 tag-contigs (TCs) based on tag overlap. Most transposon-derived sequences, exons and annotated miRNAs, tRNAs and snoRNAs are detected by TCs, which show distinct patterns of length and tag-depth for different categories. The typical length and tag-depth of snoRNA-derived TCs were used to predict 7 previously unrecognized box H/ACA and 26 box C/D snoRNA candidates. We also identified one snRNA candidate and 86 loci with a high number of tags that are yet to be annotated, 7 of which contain a particular 18-mer motif and are located in introns of genes involved in development. A subset of the new snoRNA candidates and putative ncRNA candidates was verified by Northern blot.

Conclusions: In this study, we have introduced a new approach to identify new members of known classes of ncRNAs based on the features of TCs corresponding to known ncRNAs. A large number of the identified TCs are yet to be examined experimentally, suggesting that many more novel ncRNAs remain to be discovered.
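The assembly of mapped tags into tag-contigs based on tag overlap can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact overlap rule, so the sketch assumes that tags are merged per chromosome and strand, that an overlap of at least one nucleotide is sufficient, that coordinates are half-open (`start` inclusive, `end` exclusive), and that tag-depth is the summed read count of all tags in a contig. The names `Tag`, `TagContig` and `assemble_tag_contigs` are hypothetical.

```python
from collections import namedtuple

# Hypothetical record types; coordinates are half-open: [start, end).
Tag = namedtuple("Tag", "chrom strand start end count")
TagContig = namedtuple("TagContig", "chrom strand start end n_tags depth")


def assemble_tag_contigs(tags):
    """Merge overlapping tags on the same chromosome and strand.

    Assumptions (not taken from the source): an overlap of >= 1 nt
    joins two tags, and depth is the sum of the tags' read counts.
    """
    contigs = []
    current = None  # mutable list: [chrom, strand, start, end, n_tags, depth]
    ordered = sorted(tags, key=lambda t: (t.chrom, t.strand, t.start, t.end))
    for tag in ordered:
        overlaps = (
            current is not None
            and tag.chrom == current[0]
            and tag.strand == current[1]
            and tag.start < current[3]  # shares at least one position
        )
        if overlaps:
            current[3] = max(current[3], tag.end)  # extend the contig
            current[4] += 1
            current[5] += tag.count
        else:
            if current is not None:
                contigs.append(TagContig(*current))
            current = [tag.chrom, tag.strand, tag.start, tag.end, 1, tag.count]
    if current is not None:
        contigs.append(TagContig(*current))
    return contigs


def contig_length(tc):
    """Length of a tag-contig in nucleotides."""
    return tc.end - tc.start


# Made-up example tags, for illustration only.
example_tags = [
    Tag("2L", "+", 100, 121, 3),
    Tag("2L", "+", 110, 132, 2),
    Tag("2L", "+", 200, 222, 1),
    Tag("2L", "-", 105, 127, 4),
]
example_contigs = assemble_tag_contigs(example_tags)
```

In this toy input, the first two plus-strand tags overlap and form one contig, while the third plus-strand tag and the minus-strand tag each form their own. The length and depth of each resulting contig are the two features the abstract uses to distinguish categories of loci; any thresholds on them would have to be derived from TCs over annotated snoRNAs, which this sketch does not attempt.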
