COSS: Cross Ontology Semantic Similarity measure — An information content based approach

Computing the semantic similarity between concepts plays a key role in ontology mapping, psycholinguistics, information integration, and information retrieval. This paper proposes COSS (Cross Ontology Semantic Similarity), a measure that follows the information content approach and is based on Amos Tversky's psychological contrast model, for finding the semantic closeness of concepts belonging to different biomedical ontologies. This computational approach exploits knowledge sources such as ontologies and thesauri to quantify the information content (informativeness) of concepts, which helps to assess the amount of information shared by the compared concepts. The proposed approach is corpus independent and correlates well with human judgements. It has been evaluated on two biomedical ontologies within the UMLS framework, SNOMED-CT (Systematized Nomenclature of Medicine Clinical Terms) and MeSH (Medical Subject Headings), and the results are reported. This paper also proposes the RRCOSS (Refined Resnik Cross Ontology Semantic Similarity) and RLCOSS (Refined Lin Cross Ontology Semantic Similarity) measures. The three proposed approaches outperform the other computational methods, achieving the highest correlation of 0.920.
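
To make the idea of corpus-independent information content concrete, the following is a minimal sketch and not the COSS, RRCOSS, or RLCOSS formulas, which the abstract does not state. It assumes a hypothetical toy taxonomy within a single ontology, uses the intrinsic information content of Seco et al. (IC derived from the number of descendants rather than corpus counts), and then applies the standard Resnik and Lin measures on top of it. All concept names and function names are illustrative assumptions.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from SNOMED-CT or MeSH.
PARENT = {
    "disease": None,
    "infection": "disease",
    "cardiac_disease": "disease",
    "viral_infection": "infection",
    "bacterial_infection": "infection",
    "influenza": "viral_infection",
}


def ancestors(concept):
    """Return the concept together with all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def descendant_count(concept):
    """Count the concepts strictly below `concept` in the taxonomy."""
    return sum(1 for c in PARENT if c != concept and concept in ancestors(c))


def intrinsic_ic(concept):
    """Intrinsic IC: 1 - log(descendants + 1) / log(total concepts)."""
    return 1.0 - math.log(descendant_count(concept) + 1) / math.log(len(PARENT))


def resnik(a, b):
    """IC of the most informative concept that subsumes both a and b."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(intrinsic_ic(c) for c in common)


def lin(a, b):
    """Shared IC normalised by the summed IC of the two concepts."""
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 2.0 * resnik(a, b) / denom if denom else 0.0


if __name__ == "__main__":
    for pair in [("influenza", "bacterial_infection"),
                 ("influenza", "cardiac_disease")]:
        print(pair, round(resnik(*pair), 3), round(lin(*pair), 3))
```

In this sketch, leaf concepts receive the maximum information content and the root receives none, so two concepts that share only the root are scored as dissimilar. A cross-ontology measure would additionally need a way to relate concepts across two taxonomies, which this example does not attempt.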
