Automatic ICD Coding via Interactive Shared Representation Networks with Self-distillation Mechanism

The ICD coding task aims to assign codes from the International Classification of Diseases to clinical notes. Because manual coding is laborious and error-prone, many methods have been proposed to automate it. However, existing works overlook either the long-tail distribution of code frequency or the noise in clinical notes. To address both issues, we propose an Interactive Shared Representation Network with a Self-Distillation mechanism. Specifically, the interactive shared representation network builds connections among codes while modeling their co-occurrence, thereby alleviating the long-tail problem. Moreover, to cope with noisy text, a self-distillation learning mechanism encourages the model to focus on the noteworthy parts of a clinical note and extract valuable information from them. Experimental results on two MIMIC datasets demonstrate the effectiveness of our method.
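To make the two statistics named above concrete, the following is a minimal sketch of how per-code frequency (where the long tail shows up) and pairwise code co-occurrence could be counted from multi-label annotations. It is an illustration only and not the network proposed here; the function name `code_statistics` and the toy label sets are assumptions made for this example and are not drawn from MIMIC.

```python
from collections import Counter
from itertools import combinations


def code_statistics(label_sets):
    """Count per-code frequency and pairwise co-occurrence across notes.

    label_sets: iterable of code lists, one list per clinical note.
    Returns (freq, cooc), where cooc is keyed by sorted code pairs.
    """
    freq = Counter()
    cooc = Counter()
    for codes in label_sets:
        unique = sorted(set(codes))  # ignore duplicate codes within a note
        freq.update(unique)
        cooc.update(combinations(unique, 2))
    return freq, cooc


# Toy, made-up label sets (hypothetical; for illustration only).
notes = [
    ["401.9", "428.0", "427.31"],
    ["401.9", "428.0"],
    ["401.9", "250.00"],
    ["401.9"],
]

freq, cooc = code_statistics(notes)

# Codes sorted by frequency: a few frequent codes, then a tail of rare ones.
for code, count in freq.most_common():
    print(code, count)
```

In this toy data one code appears in every note while two others appear only once, which is the kind of skew the long-tail problem refers to; the co-occurrence counts are the raw signal a model could use to relate rare codes to frequent ones.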
