Using Natural Language Processing to Improve Accuracy of Automated Notifiable Disease Reporting

We examined whether using a natural language processing (NLP) system improves the accuracy and completeness of automated electronic laboratory reporting (ELR) of notifiable conditions. We used data from a community-wide health information exchange that has automated ELR functionality. We focused on methicillin-resistant Staphylococcus aureus (MRSA), a reportable infection whose results appear in unstructured, free-text culture reports. We used the Regenstrief EXtraction tool (REX) for this work. REX processed 64,554 reports that mentioned MRSA, and we compared its output to a gold standard (human review). REX correctly identified 39,491 (99.96%) of the 39,508 reports positive for MRSA and produced only 74 false positives. It achieved high sensitivity, specificity, positive predictive value, and F-measure. REX identified more than twice as many MRSA-positive reports as the ELR system without NLP. Using NLP can improve the completeness and accuracy of automated ELR.
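The four measures named above can be derived from the counts given in the abstract. The following is a minimal sketch, not the authors' evaluation code. It assumes that every one of the 64,554 processed reports was labeled either positive or negative by the human reviewers, so the number of gold-standard negatives is taken as 64,554 minus 39,508; the abstract does not state this explicitly. The names `ConfusionCounts`, `from_abstract_counts`, and `metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # reports REX flagged as MRSA positive that reviewers also called positive
    fp: int  # reports REX flagged as positive that reviewers called negative
    fn: int  # gold-standard positives that REX missed
    tn: int  # gold-standard negatives that REX also called negative


def from_abstract_counts(total: int, gold_positive: int, tp: int, fp: int) -> ConfusionCounts:
    """Fill in the confusion matrix from the four numbers in the abstract.

    Assumption: every processed report is either gold-standard positive or
    gold-standard negative, so negatives = total - gold_positive.
    """
    fn = gold_positive - tp
    gold_negative = total - gold_positive
    tn = gold_negative - fp
    return ConfusionCounts(tp=tp, fp=fp, fn=fn, tn=tn)


def metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard definitions of the measures named in the abstract."""
    sensitivity = c.tp / (c.tp + c.fn)   # also called recall
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)           # positive predictive value, i.e. precision
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)  # balanced F1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_measure": f_measure,
    }


counts = from_abstract_counts(total=64_554, gold_positive=39_508, tp=39_491, fp=74)
results = metrics(counts)

for name, value in results.items():
    print(f"{name}: {value:.4%}")
```

Under that assumption, the computed sensitivity agrees with the 99.96% reported in the abstract. The other three values are derived here rather than quoted from the paper, and they would change if some processed reports were excluded from the gold-standard comparison.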
