Unifying ontological similarity measures: A theoretical and empirical investigation

Abstract: This paper investigates ontological similarity both theoretically and empirically. Tversky’s parameterized ratio model of similarity [3] is shown to be a unifying basis for many of the well-known ontological similarity measures. A new family of ontological similarity measures is proposed that allows the characteristic set used to represent an ontological concept to be parameterized. The three subontologies of the prominent Gene Ontology (GO) are used in an empirical investigation of several ontological similarity measures. A second study, which applies well-known semantic similarity measures within two different anatomy ontologies, the NCIT anatomy and the mouse anatomy, is also presented for comparison with several of the GO results. The correlation among the measures is discussed, and the effects of two different methods of determining a concept’s information content, corpus-based and ontology-based, are compared.
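To make the unifying role of the ratio model concrete, the following is a minimal sketch of Tversky's parameterized ratio model in its standard set-theoretic form, S(A, B) = f(A ∩ B) / (f(A ∩ B) + α·f(A − B) + β·f(B − A)). It is not taken from the paper: the function name `tversky_ratio`, the use of set cardinality as the salience function f, the choice of ancestor sets as the characteristic set of a concept, and the toy concept names are all assumptions made for illustration.

```python
def tversky_ratio(a, b, alpha=1.0, beta=1.0, weight=len):
    """Tversky's ratio model over two feature sets.

    `weight` plays the role of the salience function f; plain cardinality
    is assumed here. Returning 1.0 when both sets are empty is a convention
    chosen for this sketch only.
    """
    a, b = set(a), set(b)
    common = weight(a & b)
    denom = common + alpha * weight(a - b) + beta * weight(b - a)
    return common / denom if denom else 1.0


# Hypothetical characteristic sets: each concept is represented by itself
# plus its ancestors in a toy ontology (names are illustrative only).
lipid = {"root", "process", "metabolism", "lipid metabolism"}
protein = {"root", "process", "metabolism", "protein metabolism"}

jaccard = tversky_ratio(lipid, protein, alpha=1.0, beta=1.0)  # alpha = beta = 1
dice = tversky_ratio(lipid, protein, alpha=0.5, beta=0.5)     # alpha = beta = 0.5
one_sided = tversky_ratio(lipid, protein, alpha=1.0, beta=0.0)  # asymmetric case

print(jaccard, dice, one_sided)
```

With α = β = 1 the ratio reduces to the Jaccard index, and with α = β = 0.5 it reduces to the Dice coefficient, which illustrates how a single parameterized form can cover several familiar measures. Swapping `weight` for an information-content-based function, or changing which set represents a concept, would be further variations under the same assumptions.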

[1] David W. Conrath, et al. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy, 1997, ROCLING/IJCLCLP.

[2] F. Attneave, et al. Dimensions of similarity, 1950, The American Journal of Psychology.

[3] Jérôme Euzenat, et al. A Feature and Information Theoretic Framework for Semantic Similarity and Relatedness, 2010, SEMWEB.

[4] Phillip W. Lord, et al. Semantic Similarity in Biomedical Ontologies, 2009, PLoS Comput. Biol.

[5] V. Cross, et al. Similarity and Compatibility in Fuzzy Set Theory: Assessment and Applications, 2010.

[6] Dekang Lin, et al. An Information-Theoretic Definition of Similarity, 1998, ICML.

[7] Maya R. Gupta, et al. Information-theoretic and Set-theoretic Similarity, 2006, 2006 IEEE International Symposium on Information Theory.

[8] Martha Palmer, et al. Verb Semantics and Lexical Selection, 1994, ACL.

[9] A. Tversky. Features of Similarity, 1977.

[10] V. Cross, et al. Tversky's Parameterized Similarity Ratio Model: A Basis for Semantic Relatedness, 2006, NAFIPS 2006 - 2006 Annual Meeting of the North American Fuzzy Information Processing Society.

[11] Graeme Hirst, et al. Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures, 2004.

[12] Philip Resnik, et al. Using Information Content to Evaluate Semantic Similarity in a Taxonomy, 1995, IJCAI.

[13] Tony Veale, et al. An Intrinsic Information Content Metric for Semantic Similarity in WordNet, 2004, ECAI.

[14] Marc Ehrig, et al. Ontology Alignment: Bridging the Semantic Gap, 2006.

[15] D. Gentner, et al. Respects for similarity, 1993.

[16] Max J. Egenhofer, et al. Determining Semantic Similarity among Entity Classes from Different Ontologies, 2003, IEEE Trans. Knowl. Data Eng.

[17] Valerie V. Cross, et al. Investigating Ontological Similarity Theoretically with Fuzzy Set Theory, Information Content, and Tversky Similarity and Empirically with the Gene Ontology, 2011, SUM.

[18] Yi Sun, et al. Semantic, Fuzzy Set and Fuzzy Measure Similarity for the Gene Ontology, 2007, 2007 IEEE International Fuzzy Systems Conference.

[19] Carole A. Goble, et al. Investigating Semantic Similarity Measures Across the Gene Ontology: The Relationship Between Sequence and Annotation, 2003, Bioinform.

[20] Nuno Seco, et al. Design, Implementation and Evaluation of a New Semantic Similarity Metric Combining Features and Intrinsic Information Content, 2008, OTM Conferences.

[21] Valerie V. Cross, et al. Using semantic similarity in ontology alignment, 2011, OM.

[22] Heiner Stuckenschmidt, et al. Results of the Ontology Alignment Evaluation Initiative 2007, 2006, OM.

[23] Heiner Stuckenschmidt, et al. Results of the Ontology Alignment Evaluation Initiative, 2007.