Auditing concept categorizations in the UMLS

The Unified Medical Language System (UMLS) integrates about 880,000 concepts from 100 biomedical terminologies. Each concept is assigned to at least one semantic type of the UMLS Semantic Network. During integration, some categorization errors and inconsistencies are unavoidably introduced. In this paper, we present an auditing technique for finding such errors and inconsistencies. The technique relies on an expert reviewing the pure intersections of meta-semantic types of a metaschema, which is a compact abstract view of the UMLS Semantic Network. We use a divide-and-conquer approach that handles small pure intersections differently from medium to large ones. This approach limits the review to a set of concepts in which we expect a high percentage of errors. We reviewed all concepts in the 657 pure intersections containing one to 10 concepts. Various kinds of errors were identified, and an analysis of the results is presented in the paper. We also checked the pure intersections containing more than 10 concepts for semantic soundness; the semantically suspicious ones are presented in the paper, and their concepts were reviewed.
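The partitioning step described above can be illustrated with a minimal sketch. It assumes that a pure intersection is the set of concepts whose semantic types fall under exactly one given combination of two or more meta-semantic types; the abstract does not spell out this definition, so it should be read as an assumption. The function names, the concept identifiers, and the meta-semantic type labels are hypothetical and chosen only for illustration; the size limit of 10 is taken from the abstract.

```python
from collections import defaultdict

SMALL_LIMIT = 10  # the abstract reviews pure intersections of one to 10 concepts


def pure_intersections(concept_types, type_to_meta):
    """Group concepts by the exact set of meta-semantic types they fall under.

    Assumption: only combinations of two or more meta-semantic types
    count as intersections.
    """
    groups = defaultdict(set)
    for concept, semantic_types in concept_types.items():
        metas = frozenset(type_to_meta[t] for t in semantic_types)
        if len(metas) >= 2:
            groups[metas].add(concept)
    return dict(groups)


def split_by_size(groups, limit=SMALL_LIMIT):
    """Separate small intersections (full review) from larger ones."""
    small = {k: v for k, v in groups.items() if len(v) <= limit}
    large = {k: v for k, v in groups.items() if len(v) > limit}
    return small, large


# Hypothetical toy data: semantic type -> meta-semantic type label.
type_to_meta = {
    "Enzyme": "Chemical",
    "Pharmacologic Substance": "Chemical",
    "Disease or Syndrome": "Disorder",
    "Finding": "Finding",
}

# Hypothetical concepts and their assigned semantic types.
concept_types = {
    "C1": {"Enzyme", "Pharmacologic Substance"},
    "C2": {"Enzyme", "Disease or Syndrome"},
    "C3": {"Pharmacologic Substance", "Disease or Syndrome"},
    "C4": {"Finding", "Disease or Syndrome"},
}

groups = pure_intersections(concept_types, type_to_meta)
small, large = split_by_size(groups)
```

In this sketch, every concept in a small group would go to the expert for review, while a larger group would first be judged as a whole for semantic soundness, with its concepts reviewed only if the combination of meta-semantic types looks suspicious.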
