TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples

MOTIVATION: Prediction of microRNA (miRNA) target mRNAs using machine learning approaches is an important area of research. However, most methods suffer from either high false positive or high false negative rates. One reason for this is the marked deficiency of negative examples, that is, miRNA non-target pairs. Systematic identification of non-target mRNAs has still not been addressed properly, and current machine learning approaches are therefore compelled to rely on artificially generated negative examples for training.

RESULTS: In this article, we have identified approximately 300 tissue-specific negative examples using a novel approach that involves expression profiling of both miRNAs and mRNAs, miRNA-mRNA structural interactions and seed-site conservation. The newly generated negative examples are validated against a pSILAC dataset, which confirms that the identified non-targets are indeed non-targets. These high-throughput tissue-specific negative examples and a set of experimentally verified positive examples are then used to build TargetMiner, a support vector machine (SVM)-based classifier. In addition to assessing the prediction accuracy in cross-validation experiments, TargetMiner has been validated with a completely independent experimental test dataset. Our method outperforms 10 existing target prediction algorithms and provides a good balance between sensitivity and specificity that the existing methods lack. On the completely independent test dataset, we achieve significantly higher sensitivity and specificity: 69% and 67.8% with a pool of 90 features, and 76.5% and 66.1% with a set of 30 selected features. To establish the effectiveness of the systematically generated negative examples, the SVM is also trained on a different set of negative data generated using the method of Yousef et al. With all other factors kept the same, a significantly higher false positive rate (70.6%) is observed on the independent test set. Likewise, when an existing method (NBmiRTar) is run with our proposed negative data, we observe an improvement in its performance. These results clearly establish the effectiveness of the proposed approach of selecting the negative examples systematically.

AVAILABILITY: TargetMiner is now available as an online tool at www.isical.ac.in/~bioinfo_miu
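To make the reported metrics easier to compare, the following minimal Python sketch shows how sensitivity, specificity and false positive rate follow from a confusion matrix, and how a reported specificity converts to a false positive rate (FPR = 1 - specificity). The function names and the toy labels are illustrative assumptions and are not part of TargetMiner; the converted rates are derived here by arithmetic and are not figures stated in the abstract.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FN, TN, FP for binary labels (1 = target, 0 = non-target)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of true targets that are predicted as targets."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true non-targets that are predicted as non-targets."""
    return tn / (tn + fp)


def false_positive_rate(tn, fp):
    """Fraction of true non-targets wrongly predicted as targets."""
    return fp / (tn + fp)


def fpr_percent_from_specificity(spec_percent):
    """Convert a specificity given in percent to an FPR in percent."""
    return round(100.0 - spec_percent, 1)


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0]
tp, fn, tn, fp = confusion_counts(y_true, y_pred)

print(sensitivity(tp, fn), specificity(tn, fp), false_positive_rate(tn, fp))

# Specificities reported in the abstract, converted to false positive rates.
print(fpr_percent_from_specificity(67.8))  # 90-feature model
print(fpr_percent_from_specificity(66.1))  # 30-feature model
```

Under this conversion, the specificities of 67.8% and 66.1% correspond to false positive rates of roughly 32% and 34%, which is the scale against which the 70.6% false positive rate obtained with the alternative negative data can be read.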
