Automatic ICD-10 Classification of Diseases from Dutch Discharge Letters

The International Classification of Diseases (ICD) is a widely used tool to describe patient diagnoses. At University Medical Center Utrecht (UMCU), for example, trained medical coders translate information from hospital discharge letters into ICD-10 codes for research and national disease epidemiology statistics, at considerable cost. Automatic ICD coding from discharge letters could reduce these costs. However, the task has proven challenging in practice: it is a multi-label task with a large number of very sparse categories, organized in a hierarchical structure. Moreover, existing ICD systems have been benchmarked only on easier versions of this task, such as single-label classification and classification at the higher "chapter" level of the ICD hierarchy, which contains fewer categories. In this study, we benchmark state-of-the-art ICD classification systems and two baseline systems on a large dataset constructed from Dutch cardiology discharge letters at UMCU. We evaluate all systems both on the easier chapter-level, single-label version of the task found in the literature and on the lower-level, multi-label version that is needed in practice. We find that state-of-the-art methods outperform the baselines only for the single-label version of the task. For the multi-label task, no state-of-the-art system outperforms the baselines, with the exception of HA-GRU, which achieves the best accuracy on the most difficult task. We conclude that practical performance may have been somewhat overstated in the literature, although deep learning techniques are sufficiently good to complement, though not replace, human ICD coding in our application.
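To make the difference between the task variants concrete, the following minimal Python sketch shows how one set of ICD-10 codes collapses as it moves up the hierarchy, and how a single-label target differs from a multi-label one. It is an illustration only, not the pipeline used in the study: the function names, the example codes, the choice of the three-character category as the "lower level", and the use of the first listed code as the single label are all assumptions, and the chapter table is deliberately partial.

```python
from typing import Iterable, Set

# Partial table of WHO ICD-10 chapter ranges, for illustration only.
# A full table would cover every chapter of the classification.
CHAPTER_RANGES = [
    ("A00", "B99", "I"),    # infectious and parasitic diseases
    ("C00", "D48", "II"),   # neoplasms
    ("E00", "E90", "IV"),   # endocrine, nutritional and metabolic diseases
    ("I00", "I99", "IX"),   # diseases of the circulatory system
    ("J00", "J99", "X"),    # diseases of the respiratory system
]


def to_category(code: str) -> str:
    """Truncate a full code such as 'I50.1' to its three-character category 'I50'."""
    return code.strip().upper().split(".")[0][:3]


def to_chapter(code: str) -> str:
    """Map a code to its chapter; codes outside the partial table map to 'other'."""
    category = to_category(code)
    for low, high, chapter in CHAPTER_RANGES:
        # Three-character categories compare correctly as plain strings.
        if low <= category <= high:
            return chapter
    return "other"


def label_set(codes: Iterable[str], level: str) -> Set[str]:
    """Return the multi-label target for one letter at the requested level."""
    if level == "full":
        return {c.strip().upper() for c in codes}
    if level == "category":
        return {to_category(c) for c in codes}
    if level == "chapter":
        return {to_chapter(c) for c in codes}
    raise ValueError(f"unknown level: {level}")


# Hypothetical codes attached to one discharge letter.
letter_codes = ["I50.1", "I10", "I48.0", "E11.9"]

multi_label_categories = label_set(letter_codes, "category")
multi_label_chapters = label_set(letter_codes, "chapter")

# Single-label variant, assuming the first listed code is the principal diagnosis.
single_label_chapter = to_chapter(letter_codes[0])

print(sorted(multi_label_categories))
print(sorted(multi_label_chapters))
print(single_label_chapter)
```

In this example, four distinct codes reduce to two chapters, and the single-label variant keeps only one of them, which illustrates why chapter-level and single-label benchmarks are easier than the multi-label, lower-level task needed in practice.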
