Leveraging Semantics in WordNet to Facilitate the Computer-Assisted Coding of ICD-11

The International Classification of Diseases (ICD) not only serves as the bedrock for health statistics but also provides a holistic overview of every health aspect of life. This study aims to facilitate the computer-assisted coding of the 11th revision of the ICD (ICD-11) by leveraging the data structures of ICD-11 and the semantics in WordNet. First, a computer-assisted coding framework that uses WordNet and the ICD-11 application programming interface (API) is proposed. Second, a network called CodeNet is developed from the entity relations in ICD-11 and the synonym sets in WordNet. Third, an algorithm is presented that takes disease-related text as input and generates ICD-11 code candidates from CodeNet, controlled by two tuning parameters. Finally, the proposed method is evaluated on discharge summaries from the Medical Information Mart for Intensive Care III (MIMIC-III) database and on textual information from ICD-11 entities. Experimental results indicate that the proposed coding method achieves a precision of 84% and a recall of 89%, compared with a precision of 65% and a recall of 81% for the existing ICD-11 API. The proposed method also outperforms other methods in the literature, reducing the failure rate in ICD-11 coding by up to 8%. The two tuning parameters, a similarity threshold and a percentage threshold, can be adjusted to tune the performance of the method to meet different coding needs. In sum, enriching the new structures of ICD-11 with the semantics in WordNet can help develop more reliable computer-assisted coding systems for ICD-11 coders.
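The abstract does not spell out how the two thresholds interact, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the similarity threshold decides whether an input word matches a term of an entity, and the percentage threshold decides what fraction of an entity's terms must be matched for its code to become a candidate. The synonym sets, entity table, code labels, and function names are all hypothetical stand-ins for WordNet and CodeNet.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical stand-in for WordNet synonym sets.
SYNSETS: List[FrozenSet[str]] = [
    frozenset({"heart", "cardiac"}),
    frozenset({"failure", "insufficiency"}),
    frozenset({"kidney", "renal"}),
]

# Hypothetical stand-in for CodeNet entities.
# The labels are placeholders, not real ICD-11 codes.
ENTITIES: Dict[str, List[str]] = {
    "CODE-A": ["heart", "failure"],
    "CODE-B": ["kidney", "failure"],
    "CODE-C": ["acute", "kidney", "injury"],
}


def word_similarity(a: str, b: str) -> float:
    """Toy similarity: 1.0 for identical words, 0.8 for synonyms, else 0.0."""
    if a == b:
        return 1.0
    if any(a in synset and b in synset for synset in SYNSETS):
        return 0.8
    return 0.0


def generate_candidates(
    text: str, sim_threshold: float, pct_threshold: float
) -> List[Tuple[str, float]]:
    """Return (code, matched fraction) pairs that pass both thresholds."""
    tokens = text.lower().split()
    scored: List[Tuple[str, float]] = []
    for code, terms in ENTITIES.items():
        # A term counts as matched if any input token is similar enough to it.
        matched = sum(
            1
            for term in terms
            if any(word_similarity(term, tok) >= sim_threshold for tok in tokens)
        )
        fraction = matched / len(terms)
        if fraction >= pct_threshold:
            scored.append((code, fraction))
    # Best-covered entities first; ties broken by code label.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(generate_candidates("cardiac insufficiency", 0.8, 1.0))
    print(generate_candidates("cardiac insufficiency", 0.8, 0.5))
```

In this sketch, raising the percentage threshold returns fewer candidates with fuller term coverage, which tends to favor precision, while lowering either threshold admits more candidates, which tends to favor recall.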
