LMTRDA: Using logistic model tree to predict MiRNA-disease associations by fusing multi-source information of sequences and similarities

Emerging evidence has shown that microRNAs (miRNAs) play an important role in human disease research. Identifying potential associations between miRNAs and diseases is significant for the development of pathology, diagnosis and therapy. However, only a tiny portion of all miRNA-disease pairs in current datasets has been experimentally validated. This prompts the development of high-precision computational methods to predict real association pairs. In this paper, we propose a new model, Logistic Model Tree for predicting miRNA-Disease Association (LMTRDA), which fuses multi-source information including miRNA sequences, miRNA functional similarity, disease semantic similarity, and known miRNA-disease associations. In particular, we introduce miRNA sequence information into a miRNA-disease prediction model for the first time and extract its features using a natural language processing technique. In the cross-validation experiment, LMTRDA obtained 90.51% prediction accuracy with 92.55% sensitivity and an AUC of 90.54% on the HMDD V3.0 dataset. To further evaluate the performance of LMTRDA, we compared it with different classifiers and feature descriptors. In addition, we validated the predictive ability of LMTRDA on human diseases including Breast Neoplasms, Breast Neoplasms and Lymphoma. As a result, 28, 27 and 26 out of the top 30 miRNAs associated with these diseases were verified by experiments in different kinds of case studies. These experimental results demonstrate that LMTRDA is a reliable model for predicting associations between miRNAs and diseases.
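
To make the idea of fusing sequence and similarity information concrete, the following is a minimal sketch and not the implementation used in LMTRDA. The abstract does not specify which natural language processing technique is applied, so the sketch assumes a simple k-mer "bag of words" representation of the miRNA sequence. All function names, the choice of k = 3, and the toy similarity values are hypothetical and serve only as illustration.

```python
from itertools import product


def kmer_tokens(seq, k=3):
    """Split an RNA sequence into overlapping k-mer 'words' (assumed tokenization)."""
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_frequency_vector(seq, k=3, alphabet="ACGU"):
    """Return the normalized frequency of every possible k-mer over the alphabet."""
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {word: i for i, word in enumerate(vocab)}
    counts = [0.0] * len(vocab)
    # Ignore tokens containing characters outside the alphabet (e.g. 'N').
    tokens = [t for t in kmer_tokens(seq, k) if t in index]
    for t in tokens:
        counts[index[t]] += 1.0
    total = len(tokens)
    return [c / total for c in counts] if total else counts


def fuse_pair_features(seq_vec, mirna_sim_row, disease_sim_row):
    """Concatenate the three information sources into one descriptor for a pair."""
    return list(seq_vec) + list(mirna_sim_row) + list(disease_sim_row)


# Illustrative input only: a 22-nt miRNA-like sequence and made-up similarity rows.
example_seq = "UGAGGUAGUAGGUUGUAUAGUU"
toy_mirna_similarity = [1.0, 0.2, 0.5]   # hypothetical placeholder values
toy_disease_similarity = [1.0, 0.7]      # hypothetical placeholder values

pair_descriptor = fuse_pair_features(
    kmer_frequency_vector(example_seq),
    toy_mirna_similarity,
    toy_disease_similarity,
)
```

In such a setup, each known miRNA-disease pair would yield one fused descriptor that a classifier such as a logistic model tree could then be trained on.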
