Inferring disease and pathway associations of long non-coding RNAs using heterogeneous information network model

Recent findings from biological experiments demonstrate that long non-coding RNAs (lncRNAs) are actively involved in critical cellular processes and are associated with numerous diseases. Computational prediction of lncRNA-disease associations has consequently drawn considerable research attention. This paper proposes a machine learning model that predicts lncRNA-disease associations using a Heterogeneous Information Network (HIN) of lncRNAs and diseases. A Support Vector Machine classifier is developed using a feature set extracted from a meta-path-based parameter, the Association Index, derived from the HIN. The performance of the model is validated using standard statistical metrics; it achieved an AUC value of 0.87, which is higher than that of the existing methods in the literature. The results are further validated against recent literature, which confirms many of the predicted lncRNA-disease associations. This paper also proposes an HIN-based methodology to associate lncRNAs with pathways in which they may have biological influence. A case study on the pathway associations of four well-known lncRNAs (HOTAIR, TUG1, NEAT1, and MALAT1) has been conducted. It has been observed that the same lncRNA is often associated with more than one biologically related pathway. Further exploration is needed to substantiate whether such lncRNAs have any role in determining the pathway interplay. The script and sample data for the model construction are freely available at http://bdbl.nitc.ac.in/LncDisPath/index.html.
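To make the idea of a meta-path-based feature concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the Association Index, so the sketch uses a plain count of lncRNA-gene-disease path instances as a stand-in. The function name `meta_path_counts`, the choice of a gene as the intermediate node type, and all node names in the toy network are hypothetical.

```python
from typing import Dict, Set, Tuple


def meta_path_counts(
    lnc_gene: Dict[str, Set[str]],
    gene_dis: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], int]:
    """Count lncRNA -> gene -> disease path instances per (lncRNA, disease) pair.

    This raw count is an assumed stand-in for a meta-path-based score;
    it is NOT the Association Index defined in the paper.
    """
    counts: Dict[Tuple[str, str], int] = {}
    for lnc, genes in lnc_gene.items():
        for gene in genes:
            # Each disease linked to this gene closes one path instance.
            for dis in gene_dis.get(gene, set()):
                counts[(lnc, dis)] = counts.get((lnc, dis), 0) + 1
    return counts


# Toy network with made-up node names (no real biological data).
lnc_gene = {"lncA": {"g1", "g2"}, "lncB": {"g2"}}
gene_dis = {"g1": {"disX"}, "g2": {"disX", "disY"}}

counts = meta_path_counts(lnc_gene, gene_dis)
for pair in sorted(counts):
    print(pair, counts[pair])
```

In a pipeline of the kind the abstract describes, scores like these, computed over several meta-paths, would form the feature vector for each lncRNA-disease pair that is passed to a classifier such as a Support Vector Machine.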
