Approaches to Eliminating Cycles in the UMLS Metathesaurus: Naïve vs. Formal

Applications that exploit the hierarchical relations recorded in the Unified Medical Language System (UMLS) Metathesaurus are hampered by inconsistencies in these relations. Previous work proposed a formal approach to identifying and eliminating circular hierarchical relations, which yields a directed acyclic Metathesaurus graph. However, this approach is at best semi-automatic, and its implementation is far from trivial. A simpler alternative is to avoid loops while traversing the Metathesaurus graph by preventing any node from being visited twice. Our objective is to evaluate the benefit of the formal approach to eliminating cycles over this naïve approach to avoiding them. To this end, we compared the size and semantic coherence of the sets of descendants obtained by the two approaches. Of the concepts with descendants, 12% exhibit differences between the two approaches. In these cases, the formal approach significantly reduces the number of descendants. The benefits in terms of semantic coherence are more subtle.
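As an illustration only, the naïve approach of never visiting a node twice can be sketched as a breadth-first traversal with a visited set. This is a minimal sketch under stated assumptions, not the implementation evaluated here: the function name descendants_naive, the adjacency-map representation, and the single-letter concept identifiers in the toy graph are all hypothetical and do not come from the Metathesaurus.

```python
from collections import deque


def descendants_naive(children, root):
    """Return every node reachable from `root`, visiting each node at most once.

    `children` is a hypothetical adjacency map: parent -> iterable of children.
    Marking nodes as visited guarantees termination even if the graph has cycles.
    """
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in visited:  # skip nodes already seen: this avoids loops
                visited.add(child)
                queue.append(child)
    visited.discard(root)  # a concept is not reported as its own descendant
    return visited


# Toy graph (invented identifiers) containing the cycle A -> B -> C -> A.
toy = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A", "D"],
}

print(sorted(descendants_naive(toy, "A")))
print(sorted(descendants_naive(toy, "B")))
```

In this toy graph the traversal terminates, but the cycle itself is still present: A is returned among the descendants of B even though A is B's parent. A traversal over a graph from which the circular relation had been removed would not return A there, which is one way the two approaches can produce descendant sets of different size.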