A Unified Review of Deep Learning for Automated Medical Coding

Automated medical coding, an essential task for healthcare operation and delivery, makes unstructured data manageable by predicting medical codes from clinical documents. Recent advances in deep learning for natural language processing have been widely applied to this task. However, the field lacks a unified view of how neural network architectures for medical coding are designed. This review proposes a unified framework that provides a general understanding of the building blocks of medical coding models, and it summarizes recent models under that framework. The framework decomposes medical coding into four main components: encoder modules for text feature extraction, mechanisms for building deep encoder architectures, decoder modules for transforming hidden representations into medical codes, and the usage of auxiliary information. Finally, we discuss key research challenges and future directions.
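To make the four-component decomposition concrete, the following is a minimal toy sketch in plain Python, not an implementation of any reviewed model. Everything in it is an assumption made for illustration: the hash-like `embed` function stands in for learned word embeddings, the residual smoothing in `deepen` stands in for a deep-architecture mechanism, the label-wise attention in `decode` is one possible decoder choice, and the code identifiers `CODE_A` and `CODE_B` with their descriptions are hypothetical placeholders for auxiliary information.

```python
import math
from typing import Dict, List

DIM = 8  # toy hidden size (assumption)


def embed(token: str) -> List[float]:
    """Deterministic toy embedding; a stand-in for learned word vectors."""
    seed = sum(ord(c) * (i + 1) for i, c in enumerate(token))
    return [math.sin(seed * (k + 1)) for k in range(DIM)]


def encode(tokens: List[str]) -> List[List[float]]:
    """Component 1, encoder module: one hidden vector per token."""
    return [embed(t) for t in tokens]


def deepen(hidden: List[List[float]], layers: int = 2) -> List[List[float]]:
    """Component 2, deep architecture mechanism: stacked layers with residual links."""
    for _ in range(layers):
        out = []
        for i in range(len(hidden)):
            window = hidden[max(0, i - 1): i + 2]  # local context of width 3
            avg = [sum(col) / len(window) for col in zip(*window)]
            out.append([h + a for h, a in zip(hidden[i], avg)])  # residual sum
        hidden = out
    return hidden


def label_vectors(descriptions: Dict[str, str]) -> Dict[str, List[float]]:
    """Component 4, auxiliary information: label vectors built from code descriptions."""
    vectors = {}
    for code, text in descriptions.items():
        embs = [embed(t) for t in text.lower().split()]
        vectors[code] = [sum(col) / len(embs) for col in zip(*embs)]
    return vectors


def decode(hidden: List[List[float]], labels: Dict[str, List[float]]) -> Dict[str, float]:
    """Component 3, decoder module: label-wise attention, then a sigmoid per code."""
    probs = {}
    for code, q in labels.items():
        scores = [sum(a * b for a, b in zip(h, q)) for h in hidden]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(weights)
        context = [sum(w * h[k] for w, h in zip(weights, hidden)) / total
                   for k in range(DIM)]
        logit = sum(a * b for a, b in zip(context, q))
        probs[code] = 1.0 / (1.0 + math.exp(-logit))
    return probs


def predict_codes(note: str, descriptions: Dict[str, str]) -> Dict[str, float]:
    """Run a non-empty note through encoder, deep stack, and decoder."""
    hidden = deepen(encode(note.lower().split()))
    return decode(hidden, label_vectors(descriptions))


if __name__ == "__main__":
    # Hypothetical codes and descriptions, for illustration only.
    toy_codes = {"CODE_A": "chest pain", "CODE_B": "fracture of arm"}
    print(predict_codes("patient admitted with chest pain", toy_codes))
```

Each code receives an independent probability, which reflects the multi-label nature of the task: a single clinical document may be assigned several codes at once. The sketch has no trained parameters, so its scores carry no clinical meaning.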
