Customization of biomedical terminologies

In the biomedical domain, over one hundred terminologies exist and are merged in the Unified Medical Language System Metathesaurus, which contains over 1 million concepts. Users of such large terminological resources must cope with their size and, in particular, with the parts that are irrelevant to their needs. We propose to exploit seed terms and semantic distance algorithms in order to customize the terminologies and to delimit within them a semantically homogeneous space. An evaluation performed by a medical expert indicates that the proposed approach is suitable for the customization of terminologies and that the extracted terms are mostly relevant to the seeds. It also indicates that different algorithms provide similar or identical results within a given terminology; the differences observed are due to the terminologies exploited. Special attention must be paid to defining the optimal association between the semantic similarity algorithms and the thresholds specific to a given terminology.
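The idea of delimiting a semantically homogeneous space around seed terms can be illustrated with a minimal sketch. It assumes a simple path-length distance over is-a links (shortest path, counted in edges) and a distance threshold; this is only one possible semantic distance, not necessarily the algorithms evaluated in this work. The toy hierarchy `IS_A` and the function names `build_graph`, `distances_from` and `customize` are hypothetical and do not come from any real terminology or library.

```python
from collections import deque

# Hypothetical toy is-a hierarchy (child -> parents); illustrative only.
IS_A = {
    "myocardial infarction": ["heart disease"],
    "angina": ["heart disease"],
    "heart disease": ["cardiovascular disease"],
    "stroke": ["cardiovascular disease"],
    "cardiovascular disease": ["disease"],
    "asthma": ["respiratory disease"],
    "respiratory disease": ["disease"],
}


def build_graph(is_a):
    """Turn child -> parents links into an undirected adjacency map."""
    graph = {}
    for child, parents in is_a.items():
        for parent in parents:
            graph.setdefault(child, set()).add(parent)
            graph.setdefault(parent, set()).add(child)
    return graph


def distances_from(graph, seed, max_dist):
    """Breadth-first search: edge-count distance from seed, cut off at max_dist."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the threshold
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def customize(is_a, seeds, max_dist):
    """Keep terms within max_dist of at least one seed, with their smallest distance."""
    graph = build_graph(is_a)
    selected = {}
    for seed in seeds:
        for term, d in distances_from(graph, seed, max_dist).items():
            if term not in selected or d < selected[term]:
                selected[term] = d
    return selected


if __name__ == "__main__":
    for term, d in sorted(customize(IS_A, ["myocardial infarction"], 2).items()):
        print(d, term)
```

In this sketch the threshold `max_dist` plays the role of the terminology-specific threshold mentioned above: a larger value widens the extracted space, and a suitable value would have to be tuned for each terminology and each distance measure.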
