Prediction of pre-miRNA with multiple stem-loops using pruning algorithm

Alongside experimental identification of pre-miRNAs, computational prediction has become an active research area. Most existing prediction methods exclude pre-miRNAs with multiple loops. However, as more miRNAs have been identified, a considerable number of miRNA precursors with multiple loops have been found. Effectively distinguishing pre-miRNAs with multiple loops from control sequences that also have multiple loops is therefore an important problem. In this work, a pruning algorithm is presented to identify the main branch among the multiple stem-loops of a pre-miRNA. A stack algorithm is employed to describe the secondary structure of the pre-miRNA in four different patterns, and a recursive algorithm is employed to split the multiple stem-loops into several small branches and to identify the main branch. Statistical results indicate that the information in the main branch can represent the whole pre-miRNA sequence. Features of the main branch are extracted to describe the intrinsic features of the pre-miRNA, and an SVM classifier is implemented to recognize real pre-miRNAs with multiple stem-loops. Trained and tested on datasets from miRBase 12.0, the SVM classifier achieves a sensitivity of 75.76% on RM-POS, a specificity of 98.12% on RM-CDS, and a specificity of 91.28% on RM-NCR. These results indicate that the main branch retained after pruning can represent the intrinsic features of pre-miRNAs with multiple stem-loops. The proposed method provides an effective way to recognize real pre-miRNAs with multiple stem-loops.
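The stack-based parsing and recursive pruning described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the secondary structure is given in dot-bracket notation, and it assumes the main branch is the nested path containing the most base pairs, a criterion the abstract does not state. The function names `pair_table`, `main_branch`, and `prune` are hypothetical, and the four structural patterns mentioned in the abstract are not modeled.

```python
def pair_table(structure):
    """Use a stack to match brackets; pairs[i] is the partner of i, or -1."""
    stack = []
    pairs = [-1] * len(structure)
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def main_branch(pairs, start, end):
    """Recursively return the base pairs on the heaviest path in [start, end].

    Assumption: "heaviest" means most base pairs; ties keep the first branch.
    """
    best = []
    k = start
    while k <= end:
        partner = pairs[k]
        if partner > k:
            # A helix opens here: follow it inward, then skip past its close.
            path = [(k, partner)] + main_branch(pairs, k + 1, partner - 1)
            if len(path) > len(best):
                best = path
            k = partner + 1
        else:
            k += 1  # unpaired position
    return best


def prune(structure):
    """Keep only the main branch; every other base pair becomes unpaired."""
    pairs = pair_table(structure)
    kept = main_branch(pairs, 0, len(structure) - 1)
    pruned = ["."] * len(structure)
    for i, j in kept:
        pruned[i], pruned[j] = "(", ")"
    return "".join(pruned)


if __name__ == "__main__":
    # Toy two-branch structure (not a real pre-miRNA): the 3-pair branch
    # is kept and the 2-pair side branch is pruned away.
    print(prune("((..((...))..(((...)))..))"))
```

In this sketch the pruned structure is a single stem-loop, so features designed for single-loop hairpins could then be computed on it; which features the authors actually extract is not specified in the abstract.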
