Empirical, automated vocabulary discovery using large text corpora and advanced natural language processing tools.

A major impediment to the full benefit of electronic medical records is the lack of a comprehensive clinical vocabulary. Most existing vocabularies cannot capture the full expressiveness of clinical diagnoses and findings, which are often qualified by modifiers relating to severity, acuity, and temporal factors. One reason for this lack of expressivity is that traditional manual construction techniques cannot identify the diversity of language used by clinicians. This study used advanced natural language processing tools to identify terminology in a clinical findings domain, compare its coverage with that of the UMLS Metathesaurus, and quantify the effort required to discover the additional terminology. A substantial number of phrases and individual modifiers were not present in the UMLS Metathesaurus, and only modest effort in human time and computer processing was needed to obtain the larger set of terms.
