Evaluation of the UMLS as a terminology and knowledge resource for biomedical informatics

OBJECTIVES Terminology and knowledge resources are essential components of interoperability among disparate systems. This paper evaluates whether names and relationships needed in biomedical informatics are present in the UMLS. METHODS Terms for five broad categories of concepts were extracted from LocusLink and mapped to the UMLS Metathesaurus. Relationships between gene products and the other four categories (phenotype, molecular function, biological process, and cellular component) were searched for in the Metathesaurus. All gene products in the Gene Ontology database were also mapped to the UMLS to evaluate its global coverage of the domain. RESULTS The coverage of concepts ranged from 2% (gene product symbols) to 44% (molecular functions). The coverage of relationships ranged from 60% for Gene product-Biological process to 83% for Gene product-Molecular function. DISCUSSION Terminology and ontology issues are discussed, as well as the need for integrating additional resources into the UMLS.