A controlled trial of automated classification of negation from clinical notes

Background

Identification of negation in electronic health records is essential if we are to understand the computable meaning of the records. Our objective is to compare the accuracy of an automated mechanism for assigning negation to clinical concepts within a compositional expression with human-assigned negation. We also perform a failure analysis to identify the causes of poorly identified negation (i.e., missed conceptual representation, inaccurate conceptual representation, missed negation, and inaccurate identification of negation).

Methods

Forty-one clinical documents (medical evaluations; outside of Mayo these are sometimes referred to as history and physical examinations) were parsed using the Mayo Vocabulary Server Parsing Engine. SNOMED-CT™ was used to provide concept coverage for the clinical concepts in the record. Parsing these records identified concepts and textual clues to negation. The records were reviewed by an independent medical terminologist, and the results were tallied in a spreadsheet. Where questions arose during the review, Internal Medicine faculty made the final determination.

Results

SNOMED-CT was used to provide concept coverage of the 14,792 concepts in 41 health records from Johns Hopkins University. Of these, 1,823 concepts were identified as negative by human review. The sensitivity (recall) of the assignment of negation was 97.2% (p < 0.001, Pearson chi-square test, compared with a coin flip). The specificity of the assignment of negation was 98.8%. The positive likelihood ratio of the negation was 81. The positive predictive value (precision) was 91.2%.

Conclusion

Automated assignment of negation to concepts identified in health records based on review of the text is feasible and practical. Lexical assignment of negation is a good test of true negativity, as judged by the high sensitivity, specificity, and positive likelihood ratio of the test. SNOMED-CT had overall coverage of 88.7% of the concepts being negated.
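The reported statistics are standard confusion-matrix measures, treating human review as the reference standard and the automated negation assignment as the test. The Python sketch below shows how such measures are typically computed; it is not the authors' analysis code. The function names are hypothetical, and the counts passed to `diagnostic_metrics` are invented for illustration, since the abstract does not report the underlying true/false positive and negative counts. Only the sensitivity (97.2%) and specificity (98.8%) passed to `positive_likelihood_ratio` come from the abstract; dividing sensitivity by one minus specificity gives a value of about 81, in line with the reported positive likelihood ratio.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute test-accuracy measures from confusion-matrix counts.

    tp: concepts negated by both the system and the human reviewer
    fp: concepts negated by the system only
    fn: concepts negated by the human reviewer only
    tn: concepts negated by neither
    """
    sensitivity = tp / (tp + fn)   # recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # precision
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "lr_positive": positive_likelihood_ratio(sensitivity, specificity),
    }


if __name__ == "__main__":
    # Values taken from the abstract: sensitivity 97.2%, specificity 98.8%.
    lr = positive_likelihood_ratio(0.972, 0.988)
    print(f"LR+ from reported sensitivity and specificity: {lr:.1f}")

    # Hypothetical counts, NOT from the study, used only to exercise the function.
    example = diagnostic_metrics(tp=90, fp=5, fn=10, tn=895)
    for name, value in example.items():
        print(f"{name}: {value:.3f}")
```

A positive likelihood ratio this large means that a lexical negation flag is far more likely to appear on a concept that the reviewer judged negated than on one judged affirmed, which is the basis for the conclusion that lexical assignment is a good test of true negativity.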
