Reliable prediction of Drosha processing sites improves microRNA gene prediction.

MOTIVATION: Mature microRNAs (miRNAs) are processed from long hairpin transcripts. Although it is only the first of several steps, the initial Drosha processing defines the mature product and is characteristic of all miRNA genes. Methods that can distinguish true from false processing sites are therefore essential to miRNA gene discovery.

RESULTS: We present a classifier that predicts 5' Drosha processing sites in hairpins that are candidate miRNAs. The classifier, called Microprocessor SVM, correctly predicts the processing site for 50% of known human 5' miRNAs, and 90% of its predictions are within two nucleotides of the true site. A second classifier, trained on the output of the Microprocessor SVM, outperforms existing methods for predicting unconserved miRNAs. Reanalysis of the characteristics of, and supporting evidence for, a set of newly annotated miRNAs shows that some of them may be misannotated. This suggests that expressed hairpins should not be annotated as miRNAs until they are verified to be Drosha and Dicer substrates.

AVAILABILITY: The classifiers are publicly available at https://demo1.interagon.com/miRNA/
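The two accuracy figures quoted above (exact-site hits and predictions within two nucleotides of the true site) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `site_accuracy`, the `tolerance` parameter, and the toy positions are all hypothetical, and the sketch assumes one predicted and one annotated 5' processing position per hairpin, given as integer offsets.

```python
from typing import Sequence, Tuple


def site_accuracy(
    predicted: Sequence[int],
    annotated: Sequence[int],
    tolerance: int = 2,
) -> Tuple[float, float]:
    """Return (exact fraction, within-tolerance fraction) for paired sites.

    Hypothetical helper: positions are integer offsets into each hairpin,
    one predicted and one annotated site per hairpin.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        raise ValueError("at least one hairpin is required")

    # Absolute distance, in nucleotides, between prediction and annotation.
    offsets = [abs(p - a) for p, a in zip(predicted, annotated)]

    exact = sum(o == 0 for o in offsets) / len(offsets)
    near = sum(o <= tolerance for o in offsets) / len(offsets)
    return exact, near


# Toy data (made up for illustration): four hairpins.
exact, near = site_accuracy([10, 12, 30, 41], [10, 14, 25, 41])
print(f"exact: {exact:.2f}, within 2 nt: {near:.2f}")
```

Reporting both numbers separates a classifier that is often slightly off from one that is occasionally far off, which matters when the predicted site is used to define the mature miRNA sequence.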
