EHR Coding with Multi-scale Feature Attention and Structured Knowledge Graph Propagation

Assigning standard medical codes (e.g., ICD-9-CM) that represent diagnoses or procedures to electronic health records (EHRs) is an important task in the medical domain. However, automatic coding is difficult because a clinical note is composed of multiple long and heterogeneous textual narratives (e.g., discharge diagnoses, pathology reports, surgical procedure notes). Furthermore, the code label space is large and the label distribution is extremely unbalanced. State-of-the-art methods mainly treat EHR coding as a multi-label text classification task and use shallow convolutional neural networks with a fixed window size, which can learn neither variable n-gram features nor the ontology structure between codes. In this paper, we leverage a densely connected convolutional neural network, which is able to produce variable n-gram features, for clinical note feature learning. We also incorporate multi-scale feature attention to adaptively select among multi-scale features, since the most informative n-gram for each word in a clinical note can vary in length depending on its neighborhood. Furthermore, we leverage a graph convolutional neural network to capture both the hierarchical relationships among medical codes and the semantics of each code. Finally, we validate our method on a public dataset, and the evaluation results indicate that our method can significantly outperform other state-of-the-art models.
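The two mechanisms named above can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so everything here is an assumption made for illustration: `multi_scale_attention` scores each n-gram scale at each word position with a hypothetical scoring vector `score_weights` and mixes the scales with a softmax, and `gcn_layer` applies one step of the widely used symmetric-normalized graph convolution (self-loops, degree normalization, ReLU) over a code hierarchy given as an adjacency matrix. The function names, argument layouts, and the plain-Python list representation are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_attention(features, score_weights):
    """Mix per-scale features at every word position.

    features[s][t] is the feature vector produced at scale s (e.g. an
    n-gram width) for word position t. score_weights is a hypothetical
    scoring vector; in a real model it would be learned.
    Returns (mixed, attn): mixed[t] is the attention-weighted sum over
    scales and attn[t] holds the weights used at position t.
    """
    n_scales = len(features)
    n_positions = len(features[0])
    mixed, attn = [], []
    for t in range(n_positions):
        # One scalar score per scale: dot product with the scoring vector.
        scores = [
            sum(w * x for w, x in zip(score_weights, features[s][t]))
            for s in range(n_scales)
        ]
        weights = softmax(scores)
        dim = len(features[0][t])
        mixed.append([
            sum(weights[s] * features[s][t][d] for s in range(n_scales))
            for d in range(dim)
        ])
        attn.append(weights)
    return mixed, attn


def gcn_layer(adj, h, w):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj is a symmetric 0/1 adjacency matrix over codes, h holds one
    feature row per code, and w is an (illustrative) weight matrix.
    """
    n = len(adj)
    # Add self-loops so each code keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, then apply the linear map and ReLU.
    agg = [[sum(norm[i][k] * h[k][d] for k in range(n))
            for d in range(len(h[0]))] for i in range(n)]
    return [[max(0.0, sum(agg[i][d] * w[d][o] for d in range(len(w))))
             for o in range(len(w[0]))] for i in range(n)]


if __name__ == "__main__":
    # Two scales, two word positions, two feature dimensions (toy values).
    feats = [[[0.0, 1.0], [2.0, 0.0]],
             [[0.0, 3.0], [0.0, 4.0]]]
    mixed, attn = multi_scale_attention(feats, [1.0, 0.0])
    print(attn)
    # Two codes linked by one parent-child edge, one feature each.
    print(gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]]))
```

In this toy setup the attention weights differ per word position, which mirrors the idea that the most informative n-gram length depends on the neighborhood, and the graph step lets a code's representation borrow from the codes it is linked to in the hierarchy.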
