MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning

Healthcare representation learning on Electronic Health Records (EHRs) is seen as crucial for predictive analytics in the medical field. Many natural language processing techniques, such as word2vec, RNNs, and self-attention, have been adapted to hierarchical, time-stamped EHR data, but they fail when either general or task-specific data is lacking. Hence, some recent works train healthcare representations by incorporating a medical ontology (a.k.a. knowledge graph) through self-supervised tasks such as diagnosis prediction. However, (1) the small-scale, monotonous ontology is insufficient for robust learning, and (2) critical contexts and dependencies underlying patient journeys are never exploited to enhance ontology learning. To address these limitations, we propose an end-to-end, robust, Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO), for healthcare representation learning and predictive analytics. Specifically, it consists of task-specific representation learning and graph-embedding modules that learn the patient journey and the medical ontology interactively. This creates a mutual integration that benefits both healthcare representation learning and medical ontology embedding. The integration is achieved by jointly training a task-specific predictive task and an ontology-based disease typing task on fused embeddings of the two modules. Experiments on two real-world diagnosis prediction datasets show that MIMO not only achieves better predictive results than previous state-of-the-art approaches, whether training data is sufficient or insufficient, but also derives more interpretable embeddings of diagnoses.
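The joint training described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion operator, the loss functions, or how the two tasks are weighted, so everything below is an assumption made for illustration: element-wise addition as the fusion, binary cross-entropy for both tasks, a hypothetical weight `lam`, and the hypothetical names `fuse`, `linear`, `bce`, and `joint_loss`. It is not the authors' implementation.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def linear(weights, vec):
    """Apply a weight matrix (list of rows) to a vector, without bias."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def fuse(journey_vec, ontology_vec):
    """Fuse the two module outputs. Element-wise sum is an assumption."""
    return [j + o for j, o in zip(journey_vec, ontology_vec)]

def bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over all output labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)

def joint_loss(journey_vec, ontology_vec, w_pred, w_type,
               y_pred, y_type, lam=0.5):
    """Weighted sum of the predictive loss and the disease typing loss,
    both computed from the same fused embedding (lam is hypothetical)."""
    fused = fuse(journey_vec, ontology_vec)
    p_pred = [sigmoid(z) for z in linear(w_pred, fused)]
    p_type = [sigmoid(z) for z in linear(w_type, fused)]
    return bce(p_pred, y_pred) + lam * bce(p_type, y_type)

# Toy inputs: 3-dimensional embeddings, 2 diagnosis labels, 2 type labels.
journey = [0.2, -0.1, 0.4]
ontology = [0.1, 0.3, -0.2]
zero_head = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = joint_loss(journey, ontology, zero_head, zero_head,
                  y_pred=[1, 0], y_type=[0, 1])
```

Because both losses are computed from the same fused vector, gradients from either task would reach both the patient journey module and the ontology module in a trainable version, which is one plausible reading of the "mutual integration" the abstract describes.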