Assessment of semantic similarity of concepts defined in ontology

Abstract

The enormous amount of information available on the Web creates a demand for automatic ways of processing and analyzing data. One of the most common activities performed by these processes is the comparison of data, done either to find something new or to confirm what is already known. In each case, there is a need to determine the similarity between different objects and pieces of information. Determining similarity is relatively easy for numerical data, but much less so for symbolic data. At the same time, the development of Web technologies has led to the introduction of XML-based data representation formats, the Resource Description Framework (RDF), and ontologies. This paper proposes a method for determining semantic similarity between concepts defined in an ontology. In contrast to other techniques that use ontological definitions of concepts for similarity assessment, the proposed approach focuses on the relations between concepts and on the semantics of those relations. The method determines similarity not only at the definition (abstract) level, but also between concrete pieces of information that are instances of concepts. In addition, it allows for context-aware similarity assessment, in which only the specific sets of relations identified by the context are taken into consideration. An experimental comparison of our approach with other techniques known from the literature shows satisfactory results.
