CGMDA: An Approach to Predict and Validate MicroRNA-Disease Associations by Utilizing Chaos Game Representation and LightGBM

Recent studies have shown that microRNAs (miRNAs) play an important role in complex human diseases. Identifying potential miRNA-disease associations helps clarify disease pathogenesis. However, few methods so far predict miRNA-disease associations from sequence information, and those methods quantify only nonlinear sequence relationships without taking linear sequence information into account. In this work, we designed CGMDA, a computational method for predicting miRNA-disease associations based on chaos game representation, to overcome these problems. CGMDA combines association information with miRNA sequence information, miRNA functional information and disease semantic information to improve prediction accuracy. In particular, we use chaos game representation (CGR) for the first time to transform miRNA sequence information into image information and extract its features. In the cross-validation experiment, CGMDA achieved a mean area under the receiver operating characteristic curve (AUC) of 0.9099 on the HMDD v3.0 data set. To better evaluate the performance of CGMDA, we compared it with different classifiers and related prediction methods. In addition, we applied CGMDA to three complex human diseases. In these case studies, of the top 40 predicted disease-related miRNAs, 39 (Breast Neoplasm), 39 (Lymphoma) and 38 (Colon Neoplasm) were validated by experiments. These results show that CGMDA is a reliable tool with potential applications in assisting early diagnosis, treatment and prognosis.
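To make the CGR step concrete, the following is a minimal Python sketch of how a miRNA sequence can be mapped to points in the unit square and then binned into a small count grid that could serve as an image-like feature. It is an illustration under stated assumptions, not the CGMDA implementation: the corner assignment (A, C, G, U), the starting point (0.5, 0.5), the grid resolution `k`, and the function names `cgr_points` and `cgr_grid` are all hypothetical choices that follow the common CGR convention.

```python
from typing import List, Tuple

# Assumed corner layout (common CGR convention); the paper's layout may differ.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return one CGR point per nucleotide of an RNA sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper().replace("T", "U"):
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq: str, k: int = 3) -> List[List[int]]:
    """Count CGR points in a 2^k x 2^k grid (hypothetical feature matrix)."""
    n = 2 ** k
    grid = [[0] * n for _ in range(n)]
    for x, y in cgr_points(seq):
        col = min(int(x * n), n - 1)
        row = min(int(y * n), n - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    # Example sequence chosen only for illustration.
    for row in cgr_grid("UGAGGUAGUAGGUUGUAUAGUU", k=2):
        print(row)
```

Each cell of the grid counts the points that fall into it, so the flattened grid gives a fixed-length vector regardless of sequence length. How CGMDA actually extracts features from the CGR image is not specified in the abstract, so this binning is only one plausible choice.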
