Measuring Semantic Similarity in the Taxonomy of WordNet

This paper presents a new model for measuring semantic similarity in the WordNet taxonomy using edge-counting techniques. We evaluate the model against a benchmark of human similarity judgments and obtain a substantially better result than other methods: the correlation with average human judgment on a standard 28-word-pair dataset is 0.921, which is higher than any result reported in the literature and also significantly better than average individual human judgments. Because this set has effectively been used for algorithm selection and tuning, we also cross-validate on an independent 37-word-pair test set (0.876) and present results for the full 65-word-pair superset (0.897).
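
As a rough illustration of what edge counting and correlation against human ratings involve, the sketch below counts is-a edges between two concepts in a small hand-made taxonomy and computes a Pearson correlation. It is not the model proposed in this paper: the taxonomy `PARENT`, the functions `edge_distance`, `similarity`, and `pearson`, and the `1 / (1 + distance)` transform are all assumptions made for the example, and no WordNet data or human ratings from the benchmark are used.

```python
import math

# Hypothetical toy is-a taxonomy (child -> parent); NOT taken from WordNet.
PARENT = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "fork": "artifact",
    "artifact": "entity",
    "bird": "animal",
    "animal": "entity",
}


def path_to_root(node):
    """Return the chain of nodes from `node` up to the taxonomy root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def edge_distance(a, b):
    """Count the is-a edges on the shortest path between a and b."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in steps_from_a:  # first shared ancestor on b's way up
            return steps_from_a[n] + steps_from_b
    raise ValueError("concepts share no ancestor")


def similarity(a, b):
    """Illustrative transform only: shorter paths give higher similarity."""
    return 1.0 / (1.0 + edge_distance(a, b))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

In an evaluation of the kind described above, `similarity` would be computed for every word pair in the dataset and `pearson` applied to the resulting scores and the averaged human ratings.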
