Evaluating Semantic Metrics on Tasks of Concept Similarity

This study evaluates WordNet-based semantic similarity and relatedness measures on tasks focused on concept similarity. Treating similarity as distinct from relatedness, it aims to fill a gap in existing evaluations of these measures: past studies have either focused entirely on relatedness or evaluated judgments over words rather than concepts. First, concept similarity measures are evaluated against human judgments, using existing sets of word similarity pairs that we annotated with word senses. Next, an application-oriented evaluation is presented in which similarity and relatedness measures are integrated into an algorithm that relies on concept similarity. Interestingly, the metrics categorized as measuring relatedness correlate most strongly with human judgments of concept similarity, though the difference in correlation is small. By contrast, an information content metric, categorized as measuring similarity, is notably the strongest in the application-oriented evaluation.
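
As a rough illustration of the first evaluation, the sketch below scores a measure by how well it ranks concept pairs relative to human judgments. The abstract does not say which correlation coefficient is used; Spearman rank correlation is assumed here. The function names and all numbers are hypothetical toy values, not data or results from the study.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one score per sense-annotated concept pair.
human_judgments = [3.9, 3.1, 0.4, 1.7]
measure_scores = [0.95, 0.60, 0.10, 0.70]

rho = spearman(human_judgments, measure_scores)
```

With these toy values the two rankings disagree on one adjacent pair, so `rho` is high but below 1. Comparing `rho` across measures is one way to rank them against the same set of human judgments.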
