Inferring Potential CircRNA–Disease Associations via Deep Autoencoder-Based Classification

AIM: Circular RNAs (circRNAs) are endogenous non-coding RNA molecules with a stable circular conformation. Growing experimental evidence shows that dysregulation and abnormal expression of circRNAs are correlated with complex diseases. Identifying the causal circRNAs behind a disease is therefore invaluable for explaining its pathogenesis. Because biological experiments are difficult, slow, and prohibitively expensive, computational approaches are needed to identify the relationships between circRNAs and diseases.

METHODS: We propose an ensemble method called AE-RF, based on a deep autoencoder and a random forest classifier, to predict potential circRNA-disease associations. The method first integrates circRNA and disease similarities to construct features. The integrated features are then passed to the deep autoencoder, which extracts hidden biological patterns. The random forest classifier is trained on the extracted deep features to predict associations.

RESULTS AND DISCUSSION: AE-RF achieved AUC scores of 0.9486 and 0.9522 in fivefold and tenfold cross-validation experiments, respectively. We conducted case studies on the top-ranked predictions and on three common human cancers, and compared the method with state-of-the-art classifiers and related methods. The experimental results and case studies demonstrate the predictive power of the model, which outperforms previous methods with a high degree of robustness. Training the classifier on the features extracted by the autoencoder improved the model's predictive performance. The top predicted circRNAs are promising candidates for further biological tests.
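The feature-construction step described under METHODS can be illustrated with a minimal sketch. The abstract does not say how the similarities are integrated, so the code assumes one common choice: each circRNA-disease pair is represented by concatenating the circRNA's row of a circRNA similarity matrix with the disease's row of a disease similarity matrix, and labelled from a known association matrix. The function name `build_pair_features` and the toy matrices are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def build_pair_features(
    circ_sim: Matrix, dis_sim: Matrix, assoc: List[List[int]]
) -> Tuple[Matrix, List[int]]:
    """Build one feature vector and one label per circRNA-disease pair.

    Assumption (not from the paper): the feature vector of pair (i, j) is
    the i-th row of the circRNA similarity matrix concatenated with the
    j-th row of the disease similarity matrix.
    """
    features: Matrix = []
    labels: List[int] = []
    for i, circ_row in enumerate(circ_sim):
        for j, dis_row in enumerate(dis_sim):
            features.append(list(circ_row) + list(dis_row))
            labels.append(int(assoc[i][j]))  # 1 = known association, 0 = unknown
    return features, labels


# Toy data: 3 circRNAs, 2 diseases (values are illustrative only).
circ_sim = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.1], [0.5, 0.1, 1.0]]
dis_sim = [[1.0, 0.3], [0.3, 1.0]]
assoc = [[1, 0], [0, 1], [0, 0]]

features, labels = build_pair_features(circ_sim, dis_sim, assoc)
```

Under this assumption, each of the 6 pairs gets a 5-dimensional vector. In a pipeline like the one described, these vectors would be compressed by the autoencoder, and the compressed representation would be used to train the random forest.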
