Rise of Deep Learning Clinical Applications and Challenges in Omics Data: A Systematic Review

This study reviews and evaluates the most relevant scientific studies on deep learning (DL) models in the omics field. It also aims to help realize the full potential of DL techniques in omics data analysis by demonstrating that potential and identifying the key challenges that must be addressed. Surveying the existing literature brings out several elements that are essential for understanding these studies, such as the clinical applications and the datasets used, and it highlights the difficulties encountered by other researchers. A systematic approach is used to search all relevant publications on omics and DL using different keyword variants, alongside other study types such as guidelines, comparative studies, and review papers. The search, spanning 2018 to 2022, was conducted on four online databases: IEEE Xplore, Web of Science, ScienceDirect, and PubMed. These indexes were chosen because they offer sufficient coverage and links to numerous papers in the biological field. The inclusion and exclusion criteria were specified, and a total of 65 articles were added to the final list. Of these 65 publications, 42 are clinical applications of DL in omics data. A further 16 are review publications based on single- and multi-omics data, according to the proposed taxonomy. Finally, only a small number of articles (7/65) focus on comparative analysis and guidelines. The use of DL in studying omics data presents several obstacles related to DL itself, preprocessing procedures, datasets, model validation, and testbed applications. Numerous relevant investigations were performed to address these issues. Unlike other review papers, our study distinctly reflects different observations on the areas where omics and DL models intersect. We believe that the results of this study can serve as a useful guideline for practitioners who seek a comprehensive view of the role of DL in omics data analysis.
