Recognizing Drosha processing sites by a two-step prediction model with structure and sequence information

Drosha is an RNase III enzyme that plays an important role in microRNA (miRNA) biogenesis by cleaving primary miRNAs to release hairpin-shaped miRNA precursors. Accurately predicting Drosha cleavage positions (i.e., processing sites) helps in identifying miRNAs and in understanding the mechanisms of miRNA biogenesis. In this study, we present a Drosha processing site predictor, termed DroshaPSP, which uses a two-step prediction model that integrates structure and sequence features. On Drosophila melanogaster miRNA data, DroshaPSP achieved a sensitivity of 0.859, a specificity of 0.999, and a Matthews correlation coefficient of 0.864. We also found that Shannon entropy is a powerful structure feature that allows DroshaPSP to distinguish true Drosha processing sites from nearby pseudo processing sites.
