Representation Learning of EHR Data via Graph-Based Medical Entity Embedding

Automatic representation learning of key entities in electronic health record (EHR) data is a critical step in healthcare informatics that turns heterogeneous medical records into structured, actionable information. Here we propose ME2Vec, an algorithmic framework for learning low-dimensional vectors of the most common entities in EHRs: medical services, doctors, and patients. ME2Vec leverages diverse graph embedding techniques to cater to the unique characteristics of each medical entity. Using real-world clinical data, we demonstrate the efficacy of ME2Vec over competitive baselines on disease diagnosis prediction.
