Research Paper: Integrating SNOMED CT into the UMLS: An Exploration of Different Views of Synonymy and Quality of Editing

OBJECTIVE The integration of SNOMED CT into the Unified Medical Language System (UMLS) required aligning two views of synonymy, which differ because the two vocabulary systems have different intended purposes and editing principles. The UMLS is organized according to one view of synonymy, but its structure also represents the individual views of synonymy present in its source vocabularies. Despite progress in knowledge-based automation of vocabulary development and maintenance, manual curation is still the main method of determining synonymy. The aim of this study was to investigate the quality of human judgment of synonymy.

DESIGN Sixty pairs of potentially controversial SNOMED CT synonyms were reviewed by 11 domain vocabulary experts (six UMLS editors and five noneditors), who assigned each pair a score according to its degree of synonymy.

MEASUREMENTS The synonymy scores of each subject were compared with the gold standard (the overall mean synonymy score of all subjects) to assess accuracy. Agreement between UMLS editors and noneditors was measured by comparing the mean synonymy scores of editors with those of noneditors.

RESULTS Average accuracy was 71% for UMLS editors and 75% for noneditors (difference not statistically significant). Mean scores of editors and noneditors showed a significant positive correlation (Spearman's rank correlation coefficient 0.654, two-tailed p < 0.01), with a concurrence rate of 75% and an interrater agreement kappa of 0.43.

CONCLUSION The accuracy of synonymy judgments was comparable for UMLS editors and nonediting domain experts. There was reasonable agreement between the two groups.
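To make the agreement measures concrete, the following is a minimal sketch of how a concurrence rate and Cohen's kappa could be computed for two groups of raters. It is not the authors' procedure: the abstract does not say how the synonymy scores were reduced to categories, so the binary labels (1 = synonymous, 0 = not synonymous), the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def concurrence_rate(a, b):
    """Fraction of items on which the two label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = concurrence_rate(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability that both raters pick the same category
    # if each chose independently according to its own marginal frequencies.
    p_expected = sum(
        counts_a[k] * counts_b[k] for k in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical dichotomized judgments for eight term pairs (not study data).
editor_labels = [1, 1, 1, 0, 0, 1, 0, 1]
noneditor_labels = [1, 1, 0, 0, 0, 1, 1, 1]

rate = concurrence_rate(editor_labels, noneditor_labels)
kappa = cohen_kappa(editor_labels, noneditor_labels)
print(f"concurrence rate = {rate:.2f}, kappa = {kappa:.2f}")
```

The toy data illustrates why both figures are worth reporting: kappa comes out well below the raw concurrence rate because part of the observed agreement is expected by chance alone.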
