Detection of infectious symptoms from VA emergency department and primary care clinical documentation

OBJECTIVE Most clinical symptoms are recorded as free text in the clinical record. This information can inform clinical decision support and automated surveillance efforts if it can be accurately converted into computer-interpretable data.

METHODS We developed rule-based algorithms and evaluated a natural language processing (NLP) system for detecting infectious symptoms in clinical narratives. Training (60) and testing (444) documents were randomly selected from VA emergency department, urgent care, and primary care records. Each document was processed with NLP and independently reviewed by two clinicians, with disagreements adjudicated by a referee. Infectious symptom detection rules were developed on the training set using keywords and SNOMED-CT concepts, and were then evaluated on the testing set.

RESULTS Overall symptom detection achieved a precision of 0.91, a recall of 0.84, and an F measure of 0.87. Overall symptom detection with assertion achieved a precision of 0.67, a recall of 0.62, and an F measure of 0.64. Among instances in which the automated system matched the reference set determination for the symptom, the system correctly detected 84.7% of positive assertions, 75.1% of negative assertions, and 0.7% of uncertain assertions.

CONCLUSION This work demonstrates how processed text could enable detection of non-specific symptom clusters for use in automated surveillance activities.
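The reported F measures can be checked against the reported precision and recall. The sketch below assumes the F measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` is illustrative rather than taken from the study.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1.0 gives the balanced F1 score, which is assumed here;
    the abstract does not say which weighting was used.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was detected
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall values as reported in the abstract.
symptom_f = f_measure(0.91, 0.84)    # symptom detection
assertion_f = f_measure(0.67, 0.62)  # symptom detection with assertion

print(round(symptom_f, 2), round(assertion_f, 2))
```

Under that assumption, both values round to the figures reported in the abstract (0.87 and 0.64).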
