Prediction of lncRNA-disease associations via an embedding learning HOPE in heterogeneous information networks

Uncovering additional long non-coding RNA (lncRNA)-disease associations has become increasingly important for developing treatments for complex human diseases. Identification of lncRNA biomarkers and lncRNA-disease associations is central to diagnosis and treatment. However, traditional experimental methods are expensive and time-consuming. The large amounts of data in public biological databases can be used by computational methods to predict lncRNA-disease associations. In this study, we propose a novel computational method for this task. First, a heterogeneous network is constructed by integrating the associations among microRNA (miRNA), lncRNA, protein, drug, and disease. Second, high-order proximity preserved embedding (HOPE) is used to embed the nodes of the network. Finally, a rotation forest classifier is adopted to train the prediction model. In 5-fold cross-validation, our method achieved an area under the curve (AUC) of 0.8328 ± 0.0236. We compared it with four other classifiers, and the proposed method outperformed all of them. In addition, we conducted case studies on three cancers with high mortality rates. The results show that, for lung cancer, gastric cancer, and hepatocellular carcinoma, 9 of the top 15 disease-related lncRNAs predicted by our method were confirmed. In conclusion, our method can effectively predict unknown lncRNA-disease associations.
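The embedding step can be illustrated with a minimal sketch of HOPE, assuming the Katz index as the high-order proximity measure (the abstract does not say which proximity the authors used). The toy network, the node labels, the default choice of beta, and the function names `katz_proximity` and `hope_embedding` are all hypothetical. For brevity this version forms the proximity matrix densely and applies an ordinary SVD, which is only practical for small graphs.

```python
import numpy as np

def katz_proximity(adj, beta=None):
    """Katz proximity S = (I - beta*A)^{-1} (beta*A) for adjacency matrix A."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    if beta is None:
        # Assumption: choose beta at half the convergence limit 1 / spectral radius.
        radius = np.max(np.abs(np.linalg.eigvals(adj)))
        beta = 0.5 / radius if radius > 0 else 0.5
    return np.linalg.solve(np.eye(n) - beta * adj, beta * adj)

def hope_embedding(adj, dim, beta=None):
    """Return (source, target) embeddings with source @ target.T ~= S."""
    proximity = katz_proximity(adj, beta)
    u, s, vt = np.linalg.svd(proximity)
    root = np.sqrt(s[:dim])          # split each singular value between both sides
    source = u[:, :dim] * root
    target = vt[:dim, :].T * root
    return source, target

# Hypothetical toy heterogeneous network (undirected):
# 0 = lncRNA, 1 = miRNA, 2 = disease, 3 = protein, 4 = drug
edges = [(0, 1), (1, 2), (0, 3), (3, 2), (4, 2)]
adj = np.zeros((5, 5))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

source, target = hope_embedding(adj, dim=2)
# Proximity-based score for the unobserved lncRNA-disease pair (0, 2)
score = float(source[0] @ target[2])
```

In a pipeline like the one described, the embedding vectors of an lncRNA node and a disease node would presumably be combined into a feature vector for the classifier; how the authors do this is not stated in the abstract.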
