A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs

Background: Accumulating evidence shows that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and identifying the relationships between lncRNAs and diseases is useful for disease diagnosis and treatment. Because traditional biological experiments are costly and time-consuming, a growing number of computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods still have various limitations.

Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. Its major novelty lies in integrating direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions. This integration enables FVTLDA to make predictions for diseases without known related lncRNAs and for lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which enables it to infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of relying on a single prediction technique, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation results show that FVTLDA with MLR achieves AUCs of 0.8909, 0.8936 and 0.8970 in 5-fold cross validation (fivefold CV), 10-fold cross validation (tenfold CV) and leave-one-out cross validation (LOOCV), respectively, while FVTLDA with ANN achieves AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA.

Conclusion: The simulation results show that FVTLDA has good prediction performance, making it a useful supplement for future bioinformatics research.
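The evaluation protocol named in the Results (fivefold CV, tenfold CV and LOOCV, each summarized by an AUC) can be illustrated with a minimal sketch. It is not the FVTLDA implementation: the function names, the toy scorer and the toy data are hypothetical, and averaging per-fold AUCs is one common convention assumed here, since the abstract does not state how fold results are combined. In each fold, held-out known associations are ranked against unlabeled candidate pairs, which matches the statement that no negative samples are required.

```python
import random


def auc(pos_scores, neg_scores):
    """Rank-based AUC: chance that a held-out known pair outscores a candidate pair.

    Ties count as half a win (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def k_fold_splits(items, k, seed=0):
    """Shuffle items and deal them into k folds; k == len(items) gives LOOCV."""
    items = list(items)
    random.Random(seed).shuffle(items)
    return [items[i::k] for i in range(k)]


def cross_validated_auc(known_pairs, candidate_pairs, fit_and_score, k, seed=0):
    """Mean per-fold AUC (assumed convention, not taken from the paper).

    fit_and_score(train_pairs, pairs_to_score) is a hypothetical callback that
    returns a dict mapping each pair to a predicted association score.
    """
    fold_aucs = []
    for fold in k_fold_splits(known_pairs, k, seed):
        held_out = set(fold)
        train = [p for p in known_pairs if p not in held_out]
        scores = fit_and_score(train, list(fold) + list(candidate_pairs))
        pos = [scores[p] for p in fold]
        neg = [scores[p] for p in candidate_pairs]
        fold_aucs.append(auc(pos, neg))
    return sum(fold_aucs) / len(fold_aucs)


def toy_scorer(train_pairs, pairs_to_score):
    """Placeholder model: count training pairs sharing the lncRNA or the disease."""
    return {
        (l, d): sum(1 for (tl, td) in train_pairs if tl == l or td == d)
        for (l, d) in pairs_to_score
    }


# Toy data (hypothetical identifiers, not from any database).
known = [("L1", "D1"), ("L1", "D2"), ("L2", "D1"), ("L2", "D2")]
candidates = [("L3", "D3"), ("L4", "D4")]

loocv_auc = cross_validated_auc(known, candidates, toy_scorer, k=len(known))
print(loocv_auc)
```

Passing `k=5` or `k=10` on a realistically sized set of known associations would give the fivefold and tenfold variants; a real scorer (for example a regression fitted on pair feature vectors) would replace `toy_scorer`.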
