Tagging Patient Notes With ICD-9 Codes

There is substantial growth in the amount of medical/data being generated in hospitals. With over 96% adoption rate[1], Electronic Medical/Health Records are used to store most of this medical data. If harnessed correctly, this medium provides a very convenient platform for secondary data analysis of these records to improve medical and patient care. One crucial feature of the information stored in these systems are ICD9-diagnosis codes, which are used for billing purposes and integration to other databases. These codes are assigned to medical text and require expert annotators with experience and training. In this paper we formulate this problem as a multi-label classification problem and propose a deep learning framework to classify the ICD-9 codes a patient is assigned at the end of a visit. We demonstrate that a simple LSTM model with a single layer of non-linearity can learn to classify patient notes with their corresponding ICD-9 labels moderately well.