Heterogeneous Graph Embeddings of Electronic Health Records Improve Critical Care Disease Predictions

Electronic Health Record (EHR) data is a rich source for biomedical discovery, but it comprises a wide variety of data types that are traditionally difficult to model. Furthermore, many machine learning frameworks that use these data for predictive tasks do not fully leverage the connectivity structure among records and may therefore be suboptimal. In this work, we propose a relational, deep heterogeneous network learning method that operates on EHR data and addresses these limitations. The model uses three node types: patient, lab, and diagnosis. We show that relational graph learning naturally encodes structured relationships in the EHR and outperforms traditional feedforward models in the prediction of thousands of diseases. We evaluate our model on EHR data derived from MIMIC-III, a public critical care data set, and show that it improves prediction for numerous diagnoses.
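To make the idea of relational learning over patient, lab, and diagnosis nodes concrete, the following is a minimal sketch of one message-passing step in which each relation has its own weight matrix, in the style of a relational graph convolution. It is an illustration under stated assumptions, not the architecture used in this work: the toy edge lists, the relation names `measured_in` and `assigned_to`, the embedding size, and the helper functions are all hypothetical.

```python
import numpy as np

# Hypothetical toy graph: 3 patients, 2 labs, 2 diagnoses.
NODE_COUNTS = {"patient": 3, "lab": 2, "diagnosis": 2}

# Typed edges: (source type, relation, target type) -> list of (src, dst) indices.
# Relation names are illustrative assumptions, not taken from the paper.
EDGES = {
    ("lab", "measured_in", "patient"): [(0, 0), (1, 0), (0, 1), (1, 2)],
    ("diagnosis", "assigned_to", "patient"): [(0, 0), (1, 1)],
}


def adjacency(pairs, n_src, n_dst):
    """Row-normalised (n_dst x n_src) matrix: each target averages its neighbours."""
    a = np.zeros((n_dst, n_src))
    for s, d in pairs:
        a[d, s] = 1.0
    deg = a.sum(axis=1, keepdims=True)
    # Targets with no neighbours under this relation keep an all-zero row.
    return np.divide(a, deg, out=np.zeros_like(a), where=deg > 0)


def relational_layer(features, edges, weights, self_weight, target="patient"):
    """One step: self term plus a per-relation neighbour mean, followed by ReLU."""
    out = features[target] @ self_weight
    for (src_t, rel, dst_t), pairs in edges.items():
        if dst_t != target:
            continue
        a = adjacency(pairs, features[src_t].shape[0], features[dst_t].shape[0])
        out = out + a @ features[src_t] @ weights[rel]
    return np.maximum(out, 0.0)


rng = np.random.default_rng(0)
DIM = 4  # assumed embedding size for the toy example
features = {t: rng.normal(size=(n, DIM)) for t, n in NODE_COUNTS.items()}
weights = {rel: rng.normal(size=(DIM, DIM)) for (_, rel, _) in EDGES}
self_weight = rng.normal(size=(DIM, DIM))

patient_emb = relational_layer(features, EDGES, weights, self_weight)
print(patient_emb.shape)  # (3, 4)
```

In a predictive setting, patient embeddings of this kind would typically feed a classifier over diagnosis codes; here the weights are random and untrained, so the output only demonstrates the data flow.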
