Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling

Health conditions among patients in intensive care units (ICUs) are monitored via electronic health records (EHRs), which are composed of numerical time series and lengthy clinical note sequences, both recorded at irregular time intervals. Handling this irregularity within each modality, and carrying it into multimodal representations to improve medical predictions, is a challenging problem. Our method first addresses irregularity in each single modality by (1) modeling irregular time series with a gating mechanism that dynamically incorporates hand-crafted imputation embeddings into learned interpolation embeddings, and (2) casting a series of clinical note representations as a multivariate irregular time series and tackling its irregularity via a time attention mechanism. We further integrate irregularity into multimodal fusion with an interleaved attention mechanism across temporal steps. To the best of our knowledge, this is the first work to thoroughly model irregularity across multiple modalities for improving medical predictions. On two medical prediction tasks, our proposed methods consistently outperform state-of-the-art (SOTA) baselines in both single-modality and multimodal fusion scenarios. Specifically, we observe relative improvements of 6.5%, 3.6%, and 4.3% in F1 for time series, clinical notes, and multimodal fusion, respectively. These results demonstrate the effectiveness of our methods and the importance of considering irregularity in multimodal EHRs.
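The gating idea in step (1) can be illustrated with a minimal sketch. The abstract does not give the gate's exact form, so everything here is an assumption made for illustration: the function name `gated_fusion`, the weights `W` and `b`, and the choice of a sigmoid gate computed from the concatenated embeddings. It should not be read as the paper's actual architecture.

```python
import math
import random

def sigmoid(x):
    # Numerically stable logistic function via tanh.
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def gated_fusion(e_imp, e_int, W, b):
    """Blend two length-d embeddings with an element-wise gate (hypothetical form).

    e_imp: hand-crafted imputation embedding, length d
    e_int: learned interpolation embedding, length d
    W:     d rows, each of length 2d (assumed gate weights)
    b:     length-d bias (assumed)
    """
    concat = e_imp + e_int  # list concatenation, length 2d
    fused = []
    for j in range(len(e_imp)):
        z = sum(w * x for w, x in zip(W[j], concat)) + b[j]
        g = sigmoid(z)  # gate value in (0, 1)
        fused.append(g * e_imp[j] + (1.0 - g) * e_int[j])
    return fused

random.seed(0)
d = 4
e_imp = [random.gauss(0.0, 1.0) for _ in range(d)]
e_int = [random.gauss(0.0, 1.0) for _ in range(d)]
W = [[random.gauss(0.0, 0.1) for _ in range(2 * d)] for _ in range(d)]
b = [0.0] * d

fused = gated_fusion(e_imp, e_int, W, b)
```

Because the gate lies strictly between 0 and 1, each fused coordinate is a convex combination of the two inputs; with all-zero weights and bias the gate is 0.5 and the result is their plain average.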
