miRgo: integrating various off-the-shelf tools for identification of microRNA–target interactions by heterogeneous features and a novel evaluation indicator

MicroRNAs (miRNAs) are short non-coding RNAs that regulate gene expression and biological processes by binding to messenger RNAs. Predicting the relationships between miRNAs and their targets is crucial for research and clinical applications. Many tools have been developed to predict miRNA–target interactions, but their results vary from tool to tool, which has caused confusion for users. To address this problem, we developed miRgo, an application that integrates many of these tools. To train the prediction model, extreme values and median values from four different data combinations, obtained via an energy distribution function, were used to find the most representative dataset. Support vector machines were used to integrate 11 prediction tools, and the numerous feature types used by these tools were classified into six categories (binding energy, scoring function, evolution evidence, binding type, sequence property, and structure) to simplify feature selection. In addition, a novel evaluation indicator, the Chu-Hsieh-Liang (CHL) index, was developed to improve predictive power on positive data during feature selection. miRgo achieved better results than all other prediction tools when evaluated on an independent testing set and on its subset of functionally important genes. The tool is available at http://predictor.nchu.edu.tw/miRgo.
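
As a rough illustration of why grouping features into six categories can simplify feature selection, the sketch below enumerates every non-empty combination of the categories named above. It is not the authors' procedure: the exhaustive search and the `score_subset` callback are hypothetical, standing in for whatever evaluation (for example, a cross-validated SVM scored with an indicator such as the CHL index) would actually rank the candidates.

```python
from itertools import combinations

# The six feature categories named in the abstract.
CATEGORIES = [
    "binding energy",
    "scoring function",
    "evolution evidence",
    "binding type",
    "sequence property",
    "structure",
]


def category_subsets(categories):
    """Yield every non-empty combination of feature categories."""
    for size in range(1, len(categories) + 1):
        for subset in combinations(categories, size):
            yield subset


def select_best(categories, score_subset):
    """Return the category subset with the highest score.

    `score_subset` is a hypothetical callback (an assumption of this
    sketch), e.g. the validation score of a model trained only on the
    features that belong to the given categories.
    """
    return max(category_subsets(categories), key=score_subset)


subsets = list(category_subsets(CATEGORIES))
print(len(subsets))  # 2**6 - 1 = 63 candidate subsets
```

Working at the category level keeps the search space at 63 candidates, whereas selecting among the individual features of 11 tools would give a far larger number of combinations.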
