Automatic ICD code assignment of Chinese clinical notes based on multilayer attention BiRNN

International Classification of Diseases (ICD) code is an important label of electronic health record. The automatic ICD code assignment based on the narrative of clinical documents is an essential task which has drawn much attention recently. When Chinese clinical notes are the input corpus, the nature of Chinese brings some issues that need to be considered, such as the accuracy of word segmentation and the representation of single Chinese characters which contain semantics. Taking the lengthy text of patient notes and the representation of Chinese words into account, we present a multilayer attention bidirectional recurrent neural network (MA-BiRNN) model to implement the assignment of disease codes. A hierarchical approach is used to represent the feature of discharge summaries without manual feature engineering. The combination of character level embedding and word level embedding can improve the representation of words. Attention mechanism is introduced into bidirectional long short term memory networks, which helps to solve the performance dropping problem when plain recurrent neural networks encounter long text sequences. The experiment is carried out on a real-world dataset containing 7732 admission records in Chinese and 1177 unique ICD-10 labels. The proposed model achieves 0.639 and 0.766 in F1-score on full-level code and block-level code, respectively. It outperforms the baseline neural network models and achieves the lowest Hamming loss value. Ablation analysis indicates that the multilevel attention mechanism plays a decisive role in the system for dealing with Chinese clinical notes.

[1]  Frank D. Wood,et al.  Diagnosis code assignment: models and evaluation metrics , 2013, J. Am. Medical Informatics Assoc..

[2]  Mário J. Silva,et al.  A Deep Learning Method for ICD-10 Coding of Free-Text Death Certificates , 2017, EPIA.

[3]  Jinbo Bi,et al.  Large Scale Diagnostic Code Classification for Medical Patient Records , 2008, IJCNLP.

[4]  W. Bruce Croft,et al.  Combining classifiers in text categorization , 1996, SIGIR '96.

[5]  Yi Pan,et al.  Predicting MicroRNA-Disease Associations Based on Improved MicroRNA and Disease Similarities , 2018, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[6]  Navdeep Jaitly,et al.  Towards End-To-End Speech Recognition with Recurrent Neural Networks , 2014, ICML.

[7]  Huijuan Lu,et al.  Automatic ICD-10 coding algorithm using an improved longest common subsequence based on semantic similarity , 2017, PloS one.

[8]  Yi Pan,et al.  Prediction of lncRNA–disease associations based on inductive matrix completion , 2018, Bioinform..

[9]  Yoshua Bengio,et al.  Attention-Based Models for Speech Recognition , 2015, NIPS.

[10]  Li Li,et al.  Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records , 2016, Scientific Reports.

[11]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[12]  John F. Hurdle,et al.  Measuring diagnoses: ICD code accuracy. , 2005, Health services research.

[13]  Oladimeji Farri,et al.  Condensed Memory Networks for Clinical Diagnostic Inferencing , 2016, AAAI.

[14]  Runtong Zhang,et al.  A hierarchical method to automatically encode Chinese diagnoses through semantic similarity estimation , 2016, BMC Medical Informatics and Decision Making.

[15]  Koby Crammer,et al.  Automatic Code Assignment to Medical Text , 2007, BioNLP@ACL.

[16]  K. Bretonnel Cohen,et al.  A shared task involving multi-label classification of clinical free text , 2007, BioNLP@ACL.

[17]  Jason Weston,et al.  A kernel method for multi-labelled classification , 2001, NIPS.

[18]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[19]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[20]  Richárd Farkas,et al.  Automatic construction of rule-based ICD-9-CM coding systems , 2008, BMC Bioinformatics.

[21]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[22]  Walter F. Stewart,et al.  Doctor AI: Predicting Clinical Events via Recurrent Neural Networks , 2015, MLHC.

[23]  Cheng-Shang Chang,et al.  Predicting personality traits of Chinese users based on Facebook wall posts , 2015, 2015 24th Wireless and Optical Communication Conference (WOCC).

[24]  Anthony N. Nguyen,et al.  Automatic ICD-10 classification of cancers from free-text death certificates , 2015, Int. J. Medical Informatics.

[25]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[26]  Min-Ling Zhang,et al.  A Review on Multi-Label Learning Algorithms , 2014, IEEE Transactions on Knowledge and Data Engineering.

[27]  Yongbin Liu,et al.  Ensemble method to joint inference for knowledge extraction , 2017, Expert Syst. Appl..

[28]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[29]  Wei Shi,et al.  Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification , 2016, ACL.

[30]  Fenglong Ma,et al.  Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks , 2017, KDD.

[31]  T. H. Kyaw,et al.  Multiparameter Intelligent Monitoring in Intensive Care II: A public-access intensive care unit database* , 2011, Critical care medicine.

[32]  Jimeng Sun,et al.  RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism , 2016, NIPS.