CRPGCN: predicting circRNA-disease associations using graph convolutional network based on heterogeneous network

Background Existing studies show that circRNAs can serve as disease biomarkers and play a prominent role in the diagnosis and treatment of diseases. However, the relationships between the vast majority of circRNAs and diseases remain unclear, and more experiments are needed to study the mechanisms of circRNAs. Some researchers use the attributes of circRNAs and diseases to predict their associations. Nonetheless, most existing methods of this kind use little of the available information about the attributes of circRNAs, which limits the accuracy of the final predictions. Other researchers apply experimental methods to identify associations between circRNAs and diseases, but such methods are usually expensive and time-consuming. Given these shortcomings, a more efficient computational method for predicting circRNA-disease associations is needed.
Results In this study, a novel method, CRPGCN, is proposed to predict the associations between circRNAs and diseases. It is based on a Graph Convolutional Network (GCN) constructed with Random Walk with Restart (RWR) and Principal Component Analysis (PCA). In the construction of CRPGCN, the RWR algorithm is first used to improve the similarity associations of the computed nodes with their neighbours. PCA is then used to reduce dimensionality and extract features, which makes the connections between diseases and circRNAs with higher similarity closer. Finally, the GCN is used to learn the features of circRNAs and diseases and to calculate the final similarity scores; the learning data are assembled from the adjacency matrix, similarity matrix and feature matrix into a heterogeneous adjacency matrix and a heterogeneous feature matrix.
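The three stages described above (RWR, PCA, GCN) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the restart probability, the number of principal components, the block layout of the heterogeneous matrices, the single untrained GCN layer with Glorot-initialised weights, and the inner-product decoder are all assumptions made for illustration, and the data are random toy matrices.

```python
import numpy as np

def random_walk_with_restart(sim, restart=0.7, tol=1e-8, max_iter=1000):
    """RWR from every node at once; row i is the steady-state distribution for seed i."""
    col_sums = sim.sum(axis=0, keepdims=True)
    W = sim / np.where(col_sums == 0, 1.0, col_sums)  # column-stochastic transitions
    P0 = np.eye(sim.shape[0])                         # column j: walker starts at node j
    P = P0.copy()
    for _ in range(max_iter):
        P_next = (1 - restart) * W @ P + restart * P0
        converged = np.abs(P_next - P).max() < tol
        P = P_next
        if converged:
            break
    return P.T

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def gcn_layer(A, H, W):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

rng = np.random.default_rng(0)
n_circ, n_dis, n_comp, n_hidden = 6, 4, 3, 8          # toy sizes (assumed)

def toy_similarity(n):
    """Random symmetric similarity matrix with unit diagonal (placeholder data)."""
    M = rng.random((n, n))
    S = (M + M.T) / 2
    np.fill_diagonal(S, 1.0)
    return S

circ_sim, dis_sim = toy_similarity(n_circ), toy_similarity(n_dis)
assoc = (rng.random((n_circ, n_dis)) < 0.3).astype(float)  # toy known associations

# Stage 1: RWR refines each similarity network.
circ_rwr = random_walk_with_restart(circ_sim)
dis_rwr = random_walk_with_restart(dis_sim)

# Stage 2: PCA compresses the RWR profiles into low-dimensional features.
circ_feat = pca_reduce(circ_rwr, n_comp)
dis_feat = pca_reduce(dis_rwr, n_comp)

# Heterogeneous matrices (block layout is an assumption).
hetero_adj = np.block([[circ_sim, assoc], [assoc.T, dis_sim]])
hetero_feat = np.vstack([circ_feat, dis_feat])

# Stage 3: one GCN layer with untrained Glorot-uniform weights, then a
# sigmoid inner-product decoder to score every circRNA-disease pair.
limit = np.sqrt(6.0 / (n_comp + n_hidden))
W1 = rng.uniform(-limit, limit, size=(n_comp, n_hidden))
Z = gcn_layer(hetero_adj, hetero_feat, W1)
scores = 1.0 / (1.0 + np.exp(-(Z[:n_circ] @ Z[n_circ:].T)))
print(scores.shape)
```

Because the weights here are never trained, the resulting scores carry no predictive meaning; the sketch only shows how data would flow from the similarity networks to a circRNA-by-disease score matrix.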
Conclusions Under 2-fold, 5-fold and 10-fold cross-validation, the area under the ROC curve (AUC) of CRPGCN is 0.9490, 0.9720 and 0.9722, respectively. These results indicate that CRPGCN is effective in predicting the associations between circRNAs and diseases.
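The k-fold AUC evaluation mentioned above can be sketched as follows. The abstract does not specify the sampling protocol, so the stratified fold assignment, the balanced toy labels, and the stand-in `score_fn` (which ignores the training indices and returns precomputed noisy scores) are assumptions for illustration; a real evaluation would retrain the model on each training split.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

def stratified_folds(labels, k, rng):
    """Split indices into k folds, spreading positives and negatives evenly."""
    labels = np.asarray(labels, dtype=bool)
    pos_parts = np.array_split(rng.permutation(np.flatnonzero(labels)), k)
    neg_parts = np.array_split(rng.permutation(np.flatnonzero(~labels)), k)
    return [np.concatenate([p, n]) for p, n in zip(pos_parts, neg_parts)]

def cross_validated_auc(labels, score_fn, k, seed=0):
    """Mean AUC over k folds; score_fn(train_idx, test_idx) returns test scores."""
    labels = np.asarray(labels)
    folds = stratified_folds(labels, k, np.random.default_rng(seed))
    aucs = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        aucs.append(roc_auc(labels[test_idx], score_fn(train_idx, test_idx)))
    return float(np.mean(aucs))

# Toy data: 100 positive and 100 negative pairs with noisy, informative scores.
rng = np.random.default_rng(1)
labels = np.array([1] * 100 + [0] * 100)
toy_scores = 2.0 * labels + rng.normal(0.0, 1.0, size=labels.size)

def score_fn(train_idx, test_idx):
    # Placeholder: a real model would be fitted on train_idx before scoring test_idx.
    return toy_scores[test_idx]

results = {k: cross_validated_auc(labels, score_fn, k) for k in (2, 5, 10)}
print(results)
```

Stratifying the folds guarantees that every test split contains both classes, so the AUC is always defined.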
