How Similar Is It? Towards Personalized Similarity Measures in Ontologies

Finding a good similarity assessment algorithm for use in ontologies is central to techniques such as retrieval, matchmaking, clustering, data mining, ontology translation, automatic database schema matching, and simple object comparison. This paper assembles a catalogue of ontology-based similarity measures, which are experimentally compared with a “similarity gold standard” obtained by surveying 50 human subjects. Results show that human and algorithmic similarity predictions varied substantially but could be grouped into cohesive clusters. Addressing this variance, we present a personalized similarity assessment procedure that uses a machine learning component to predict a subject’s cluster membership, providing an excellent prediction of the gold standard. We conclude by hypothesizing ontology-dependent similarity measures.
