Investigations on Word Senses and Word Usages

The vast majority of work on word senses has relied on predefined sense inventories and an annotation schema in which each word instance is tagged with the best-fitting sense. This paper examines the case for a graded notion of word meaning in two experiments: one that uses WordNet senses in a graded fashion, contrasted with "winner-takes-all" annotation, and one that asks annotators to judge the similarity of two usages. We find that the graded responses correlate with annotations from previous datasets, but that sense assignments are used in a way that weakens the case for clear-cut sense boundaries. The responses from both experiments correlate with the overlap of paraphrases from the English lexical substitution task, which bodes well for the use of substitutes as a proxy for word sense. The paper also provides two novel datasets that can be used for evaluating computational systems.
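
As an illustration of the kind of analysis summarized above, the following minimal Python sketch correlates graded usage-similarity ratings with the overlap of lexical-substitution paraphrases for the same usage pairs. The toy data, the Jaccard overlap measure, and the use of SciPy's spearmanr are assumptions made for illustration; they are not the paper's exact procedure or data.

# Hypothetical sketch: correlate graded similarity judgments for word-usage
# pairs with the overlap of their lexical-substitution paraphrases.
# All data values below are invented for illustration only.
from scipy.stats import spearmanr

def substitute_overlap(subs_a, subs_b):
    """Jaccard overlap between the substitute sets of two usages."""
    a, b = set(subs_a), set(subs_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Each entry: (similarity rating on a 1-5 scale,
#              substitutes proposed for usage 1,
#              substitutes proposed for usage 2)
usage_pairs = [
    (5, ["blaze", "shine"], ["blaze", "glow", "shine"]),
    (2, ["dismiss", "sack"], ["shoot", "discharge"]),
    (1, ["bank", "slope"], ["deposit", "lend"]),
]

ratings = [r for r, _, _ in usage_pairs]
overlaps = [substitute_overlap(s1, s2) for _, s1, s2 in usage_pairs]

# Spearman's rank correlation between the two sets of scores.
rho, p = spearmanr(ratings, overlaps)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")

A higher rank correlation would indicate that usage pairs judged as more similar also tend to share more substitutes, which is the intuition behind using substitute overlap as a proxy for word sense.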
