Clinical coding support based on structured data stored in electronic health records

Clinical coding is an increasingly essential process within health organizations. It is usually performed manually and entails several challenges: administrative burden, rising costs and potential errors. To address these issues, several coding support systems have been proposed in the literature. However, these systems are based on text processing methods that may be limited by poor text quality, ambiguity and a lack of annotated resources. As electronic health record systems increasingly adopt structured data formats, we propose a methodology for coding support based on structured clinical data collected during inpatient care from a semi-structured electronic health record. We follow a statistical learning paradigm and investigate several building blocks of the methodology to assess the feasibility of the approach. We present and discuss preliminary results obtained with real data extracted from an Internal Medicine department and identify several measures to further develop the methodology and to improve model performance and generalizability.
