MLMDA: a machine learning approach to predict and validate MicroRNA–disease associations by integrating of heterogenous information sources

Background: Emerging evidence shows that microRNAs (miRNAs) play an important role in many complex human diseases. However, because traditional in vitro experiments are inherently time-consuming and expensive, increasing attention has been paid to developing efficient and feasible computational methods for predicting potential miRNA–disease associations.

Methods: In this work, we present a machine learning-based model called MLMDA for predicting miRNA–disease associations. Specifically, we first use the k-mer sparse matrix to extract miRNA sequence information and combine it with miRNA functional similarity, disease semantic similarity, and Gaussian interaction profile kernel similarity. Then, more representative features are extracted from the combined information with a deep auto-encoder neural network (AE). Finally, a random forest classifier is used to predict potential miRNA–disease associations.

Results: The experimental results show that the MLMDA model achieves promising performance under fivefold cross-validation, with an AUC of 0.9172, which is higher than that of the methods using different classifiers or different feature combinations evaluated in this paper. In addition, to further evaluate the prediction performance of the MLMDA model, case studies were carried out on three complex human diseases: Lymphoma, Lung Neoplasm, and Esophageal Neoplasms. As a result, 39, 37, and 36 of the top 40 predicted miRNAs, respectively, were confirmed by other miRNA–disease association databases.

Conclusions: These results suggest that the MLMDA model could serve as a useful tool for guiding future experimental validation of promising miRNA biomarker candidates. The source code and datasets explored in this work are available at http://220.171.34.3:81/.
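To make two of the feature sources named in the Methods more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the k-mer sparse matrix is a 0/1 matrix with one row per possible k-mer and one column per sequence position, and it uses the commonly cited form of the Gaussian interaction profile kernel, K(i, j) = exp(-γ‖IP(i) − IP(j)‖²), with γ normalised by the mean squared norm of the interaction profiles. The function names `kmer_sparse_matrix` and `gip_kernel`, the default `k=3`, and the parameter `gamma_prime` are illustrative assumptions.

```python
from itertools import product

import numpy as np


def kmer_sparse_matrix(seq, k=3, alphabet="ACGU"):
    """Return a (len(alphabet)**k, len(seq)-k+1) 0/1 matrix.

    Entry (i, j) is 1 when the k-mer starting at position j of `seq`
    equals the i-th k-mer in lexicographic order over `alphabet`.
    (Assumed layout; the paper's exact encoding may differ.)
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    n_pos = max(len(seq) - k + 1, 0)
    mat = np.zeros((len(kmers), n_pos), dtype=np.int8)
    for j in range(n_pos):
        i = index.get(seq[j:j + k])
        if i is not None:  # skip k-mers containing unexpected symbols
            mat[i, j] = 1
    return mat


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    `profiles` is a binary association matrix; row i is the interaction
    profile of entity i (e.g. one miRNA against all diseases).
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq
    # Squared Euclidean distances between all pairs of rows.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.clip(sq_dists, 0.0, None)  # guard against round-off
    return np.exp(-gamma * sq_dists)
```

Calling `gip_kernel(A)` on a miRNA-by-disease association matrix `A` gives miRNA similarities, and `gip_kernel(A.T)` gives disease similarities. In a pipeline of the kind described above, such matrices would be concatenated with the other similarity features before the auto-encoder and random forest stages.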
