Systematic Curation of miRBase Annotation Using Integrated Small RNA High-Throughput Sequencing Data for C. elegans and Drosophila

MicroRNAs (miRNAs) are a class of 20–23 nucleotide small RNAs that regulate gene expression post-transcriptionally in animals and plants. Annotation of miRNAs in the miRNA database (miRBase) has largely relied on computational approaches. As a result, many miRBase entries lack experimental validation, and discrepancies between miRBase annotation and actual miRNA sequences are often observed. In this study, we integrated small RNA sequencing (smRNA-seq) datasets from Caenorhabditis elegans and Drosophila melanogaster and devised an analytical pipeline, coupled with detailed manual inspection, to curate miRNA annotation in miRBase systematically. Our analysis reveals that 19 (17.0%) and 51 (31.3%) of the miRNA entries with detectable smRNA-seq reads have mature sequence discrepancies in C. elegans and D. melanogaster, respectively. These discrepancies frequently occur either in conserved miRNA families whose mature sequences were predicted from their homologous counterparts in other species, or in miRNAs whose precursor miRNA (pre-miRNA) hairpins produce multiple abundant miRNA isoforms or variants. Our analysis shows that, while Drosophila pre-miRNAs on average produce less than 60% accurate mature miRNA reads in addition to their 5′ and 3′ variant isoforms, the precision of miRNA processing in C. elegans is much higher, at over 90%. Based on the revised miRNA sequences, we analyzed the expression patterns of the more conserved (MC) and less conserved (LC) miRNAs and found that, whereas MC miRNAs are often co-expressed at multiple developmental stages, LC miRNAs tend to be expressed specifically at fewer stages.
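The processing precision mentioned above can be illustrated with a minimal sketch. It assumes precision is the fraction of reads from one hairpin arm that match the mature sequence exactly, and that a discrepancy is flagged when the most abundant read differs from the annotated sequence. The abstract does not give the pipeline's actual definitions, so these definitions, the function names, and the toy sequences are all hypothetical.

```python
# Illustrative sketch only: not the authors' pipeline.
# Assumption: read_counts maps each distinct read sequence from one
# hairpin arm to its smRNA-seq read count.

def processing_precision(read_counts, mature_seq):
    """Fraction of reads that exactly match the mature sequence."""
    total = sum(read_counts.values())
    if total == 0:
        return None  # no detectable reads for this entry
    return read_counts.get(mature_seq, 0) / total


def dominant_read(read_counts):
    """Most abundant read; ties are broken by sequence for determinism."""
    if not read_counts:
        return None
    return max(read_counts.items(), key=lambda kv: (kv[1], kv[0]))[0]


def has_discrepancy(read_counts, annotated_seq):
    """True when the most abundant read differs from the annotation."""
    top = dominant_read(read_counts)
    return top is not None and top != annotated_seq


# Toy data (hypothetical sequences, not real miRNAs).
annotated = "ACGUACGUACGUACGUACGUAC"
precise_reads = {
    annotated: 90,        # exact mature sequence
    annotated[1:]: 6,     # 5' variant isoform
    annotated[:-1]: 4,    # 3' variant isoform
}
shifted_reads = {
    annotated: 20,
    annotated[1:]: 70,    # a 5'-shifted isoform dominates
    annotated[:-1]: 10,
}

precision = processing_precision(precise_reads, annotated)
flag_precise = has_discrepancy(precise_reads, annotated)
flag_shifted = has_discrepancy(shifted_reads, annotated)
```

With the first toy dictionary, the annotated sequence accounts for 90 of 100 reads and no discrepancy is flagged. With the second, a 5′-shifted isoform dominates, so the entry would be a candidate for manual inspection.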
