New ontology-based semantic similarity measure for the biomedical domain

The goal of this research is to propose a new ontology-based semantic similarity measure and apply it to the biomedical domain. We also apply existing ontology-based semantic similarity measures from NLP to the biomedical domain within the framework of the UMLS. The proposed measure is based on the path length between the concept nodes and on the depth of their least common subsumer (LCS) node in the ontology hierarchy tree. The proposed measure was evaluated against human experts' ratings and compared with the existing measures on sets of concepts using the MeSH terminology within the UMLS. The experimental results validate the effectiveness of the proposed measure and show that, compared with the existing techniques, it achieves the best overall correlation with experts' ratings.
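As a rough illustration of the two quantities the measure relies on, the following sketch computes the path length between two concepts and the depth of their LCS in a small is-a tree. The hierarchy, the concept names, and the helper functions are hypothetical and are not taken from MeSH or from the paper. The abstract does not state how the two quantities are combined, so no combined score is computed here. The sketch also assumes that path length is counted in edges and that the root has depth 1; other conventions exist.

```python
from typing import Dict, List, Optional

# Hypothetical toy is-a hierarchy (child -> parent); not taken from MeSH.
PARENT: Dict[str, Optional[str]] = {
    "disease": None,
    "infection": "disease",
    "neoplasm": "disease",
    "bacterial infection": "infection",
    "viral infection": "infection",
    "carcinoma": "neoplasm",
}


def ancestors(node: str) -> List[str]:
    """Return the chain from `node` up to the root, starting with `node`."""
    chain: List[str] = []
    current: Optional[str] = node
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def depth(node: str) -> int:
    """Depth of `node`, counting nodes from the root (assumed root depth = 1)."""
    return len(ancestors(node))


def lcs(a: str, b: str) -> str:
    """Least common subsumer: the deepest node that is an ancestor of both."""
    seen = set(ancestors(a))
    # Walking upward from b, the first shared node is the deepest shared one.
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    raise ValueError("concepts do not share a root")


def path_length(a: str, b: str) -> int:
    """Number of edges on the shortest path between `a` and `b` via their LCS."""
    common_depth = depth(lcs(a, b))
    return (depth(a) - common_depth) + (depth(b) - common_depth)


if __name__ == "__main__":
    for pair in [("bacterial infection", "viral infection"),
                 ("bacterial infection", "carcinoma")]:
        common = lcs(*pair)
        print(pair, "LCS:", common,
              "path length:", path_length(*pair),
              "LCS depth:", depth(common))
```

In this toy tree, the two infection concepts are separated by a shorter path and share a deeper LCS than the infection and carcinoma pair. This is the kind of signal a measure built on path length and LCS depth can use to rate the first pair as more similar.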
