Temporal Self-Attention Network for Medical Concept Embedding

In longitudinal electronic health records (EHRs), a patient's event records are distributed over a long period of time, and the temporal relations between events encode domain knowledge that can benefit prediction tasks such as inpatient mortality. Medical concept embedding is a feature extraction method that transforms a set of medical concepts with specific time stamps into vectors, which are then fed into a supervised learning algorithm. The quality of the embedding largely determines learning performance on medical data. In this paper, we propose a medical concept embedding method that applies a self-attention mechanism to represent each medical concept. We propose a novel attention mechanism that captures both the contextual information and the temporal relationships between medical concepts. A lightweight neural network, the Temporal Self-Attention Network (TeSAN), is then proposed to learn medical concept embeddings based solely on the proposed attention mechanism. To test the effectiveness of our method, we conducted clustering and prediction tasks on two public EHR datasets, comparing TeSAN against five state-of-the-art embedding methods. The experimental results demonstrate that TeSAN is superior to all compared methods. To the best of our knowledge, this work is the first to exploit temporal self-attentive relations between medical events.
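The core idea of combining content-based self-attention with temporal relations between events can be illustrated with a minimal sketch. This is not the TeSAN architecture itself; the decay form `-gap / tau`, the parameter `tau`, and the function name `temporal_self_attention` are illustrative assumptions, showing only the general pattern of biasing scaled dot-product attention by the time gaps between medical events.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(X, t, tau=30.0):
    """Scaled dot-product self-attention with a temporal bias (illustrative sketch).

    X   : (n, d) array of concept embeddings for one patient's events
    t   : (n,) array of event timestamps (e.g. days since admission)
    tau : assumed decay scale; larger time gaps receive less attention
    """
    n, d = X.shape
    scores = X @ X.T / np.sqrt(d)             # pairwise content similarity
    gaps = np.abs(t[:, None] - t[None, :])    # temporal distance between event pairs
    scores = scores - gaps / tau              # penalize temporally distant pairs
    A = softmax(scores, axis=-1)              # attention weights, rows sum to 1
    return A @ X                              # context- and time-aware representations
```

Under this sketch, two events with similar content but separated by a long interval attend to each other less than equally similar events that co-occur, which is the intuition behind mixing temporal relations into the attention scores.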
