Effective classification of microRNA precursors using feature mining and AdaBoost algorithms.

MicroRNAs play important roles in most biological processes, including cell proliferation, tissue differentiation, and embryonic development. They originate from precursor transcripts (pre-miRNAs), which contain phylogenetically conserved stem-loop structures. An important bioinformatics problem is to distinguish real pre-miRNAs from pseudo pre-miRNAs that have similar stem-loop structures. We present a novel method, named MirID, for tackling this problem. MirID accepts an RNA sequence as input and classifies it as either positive (i.e., a real pre-miRNA) or negative (i.e., a pseudo pre-miRNA). MirID employs a feature mining algorithm to find combinations of features suitable for building pre-miRNA classification models. These models are implemented using support vector machines, which are combined to construct a classifier ensemble. The accuracy of the classifier ensemble is further enhanced by an AdaBoost algorithm. When compared with two closely related tools on the twelve species analyzed with those tools, MirID outperforms them on the majority of the species. MirID was also tested on nine additional species, where it achieved high accuracy. The MirID web server is fully operational and freely accessible at http://bioinformatics.njit.edu/MirID/. Potential applications of this software in genomics and medicine are also discussed.
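
To make the boosting step concrete, the following is a minimal, dependency-free sketch of discrete AdaBoost on a toy binary problem. It is not MirID's implementation: the abstract states that MirID's base models are support vector machines built on mined feature combinations, whereas this sketch substitutes simple decision stumps so that it runs with the standard library alone. The function names, the single numeric feature, and the toy data are all assumptions made for illustration; labels follow the abstract's convention of +1 for a real pre-miRNA and -1 for a pseudo pre-miRNA.

```python
import math


def stump_predict(stump, row):
    """Predict +1/-1 with a one-feature threshold rule (stand-in weak learner)."""
    feature, threshold, polarity = stump
    return polarity if row[feature] >= threshold else -polarity


def train_stump(X, y, weights):
    """Return (weighted_error, stump) for the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                error = sum(w for row, label, w in zip(X, y, weights)
                            if stump_predict(stump, row) != label)
                if best is None or error < best[0]:
                    best = (error, stump)
    return best


def adaboost_fit(X, y, rounds=3):
    """Discrete AdaBoost: reweight examples so later learners focus on mistakes."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, stump = train_stump(X, y, weights)
        perfect = error <= 1e-10
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, stump))
        if perfect:
            break  # nothing left to correct
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [w * math.exp(-alpha * label * stump_predict(stump, row))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all weak learners; +1 = real pre-miRNA, -1 = pseudo."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: one made-up numeric feature per sequence.
X = [[1], [2], [3], [4], [5], [6]]
y = [-1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
```

On this toy set no single threshold rule separates the classes, so one stump misclassifies an example, while three boosting rounds fit all six. In a pipeline like the one the abstract describes, each weak learner would presumably be an SVM trained on a mined feature combination rather than a stump.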
