Representation Learning for Electronic Health Records

Representation learning transforms information in electronic health records (EHRs), such as clinical narratives, examination reports, lab measurements, demographics, and other patient encounter entries, into data representations suitable for downstream clinical machine learning tasks. Learning better representations is critical to improving the performance of those tasks. Owing to advances in machine learning, we can now learn better and more meaningful representations from EHRs by disentangling the underlying factors in the data and distilling large amounts of information and knowledge from heterogeneous EHR sources. In this chapter, we first introduce the background of representation learning and explain why good EHR representations are needed in machine learning for medicine and healthcare (Section 1). Next, we explain the commonly used machine learning and evaluation methods for representation learning with a deep learning approach (Section 2). We then review recent studies on learning patient state representations from EHRs for clinical machine learning tasks (Section 3). Finally, in Section 4, we discuss further techniques, studies, and challenges for learning natural language representations when free text, such as clinical notes, examination reports, or biomedical literature, is used. We also discuss challenges and opportunities in these rapidly growing research fields.
