Ensemble learning can significantly improve human microRNA target prediction.

MicroRNAs (miRNAs) down-regulate the expression of their target genes and thereby participate in various biological processes. Since the discovery of the first miRNA, computational tools have been essential for predicting candidate targets of a given miRNA that can then be verified biologically. The precise mechanism underlying miRNA-mRNA interaction has not yet been fully elucidated, and robust computational prediction of miRNA targets remains difficult despite the large number of in silico prediction methodologies in existence. Because of this limitation, different target prediction tools often report different (and occasionally conflicting) sets of targets. We therefore propose a novel target prediction methodology called the stacking-based miRNA interaction learner ensemble (SMILE). SMILE employs stacked generalization (stacking), a type of ensemble learning that integrates the outcomes of individual prediction tools with the aim of surpassing the performance of each individual tool. We tested SMILE on human miRNA-mRNA interaction data derived from public databases. In our experiments, SMILE significantly improved the accuracy of target prediction, as measured by the area under the receiver operating characteristic curve. Any new target prediction tool can easily be incorporated into the proposed methodology as a component learner, and we anticipate that SMILE will provide a flexible and effective framework for elucidating in vivo miRNA-mRNA interaction.
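The idea of stacking described above can be illustrated with a minimal sketch. This is not the SMILE implementation: the three "tools", their noise levels, the synthetic labels, and the choice of logistic regression as the combining (meta-level) learner are all assumptions made for illustration. The sketch treats the scores of existing prediction tools as input features, fits the meta-learner on one part of the data, and compares the area under the ROC curve (AUC) on a held-out part.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(X, y, lr=0.1, epochs=300):
    """Logistic regression fitted by batch gradient descent.
    Each row of X holds the scores the component tools gave one pair."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


# Synthetic data (assumption): 1 = true miRNA-mRNA interaction, 0 = not.
rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(400)]

# Three hypothetical tools: each score is the label plus Gaussian noise.
noise_sd = [1.0, 1.5, 2.0]
tool_scores = [[y + rng.gauss(0.0, sd) for sd in noise_sd] for y in labels]

# Fit the meta-learner on the first 300 pairs, evaluate on the last 100.
train_X, test_X = tool_scores[:300], tool_scores[300:]
train_y, test_y = labels[:300], labels[300:]

w, b = fit_meta_learner(train_X, train_y)
stacked_auc = auc(test_y, predict(w, b, test_X))
tool_aucs = [auc(test_y, [row[j] for row in test_X]) for j in range(3)]

for j, a in enumerate(tool_aucs):
    print(f"tool {j} AUC: {a:.3f}")
print(f"stacked AUC: {stacked_auc:.3f}")
```

Adding a further component tool in this sketch only means appending one more score column to each row; the meta-learner then learns a weight for it alongside the others.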
