Making fine-grained and coarse-grained sense distinctions, both manually and automatically

In this paper we discuss a persistent problem arising from polysemy: the difficulty of finding consistent criteria for making fine-grained sense distinctions, whether manually or automatically. We investigate sources of human annotator disagreement in the tagging for the English Verb Lexical Sample Task of the Senseval-2 exercise in automatic Word Sense Disambiguation. We also examine errors made by a high-performing maximum entropy Word Sense Disambiguation system that we developed. Both sets of errors are at least partially reconciled by a more coarse-grained view of the senses, and we present the groupings we use for quantitative coarse-grained evaluation, as well as the process by which they were created. We compare the system’s performance with that of our human annotators under both fine-grained and coarse-grained sense distinctions, and show that well-defined sense groups can be of value in improving word sense disambiguation by both humans and machines.
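The difference between fine-grained and coarse-grained evaluation can be illustrated with a minimal sketch. It assumes that a coarse-grained score counts a tag as correct whenever it falls in the same sense group as the gold tag; the exact scoring procedure is not given in this abstract. The sense labels, the group names, and the function names below are all hypothetical and are not taken from the Senseval-2 data.

```python
from typing import Dict, List


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of instances whose predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def to_group(sense: str, groups: Dict[str, str]) -> str:
    """Map a fine-grained sense to its group; ungrouped senses stay as-is."""
    return groups.get(sense, sense)


def coarse_accuracy(
    gold: List[str], predicted: List[str], groups: Dict[str, str]
) -> float:
    """Accuracy after collapsing every sense label to its sense group."""
    return accuracy(
        [to_group(s, groups) for s in gold],
        [to_group(s, groups) for s in predicted],
    )


# Hypothetical grouping: two closely related senses share group "call.A".
SENSE_GROUPS = {"call.1": "call.A", "call.2": "call.A", "call.3": "call.B"}

gold_tags = ["call.1", "call.2", "call.3", "call.1"]
system_tags = ["call.2", "call.2", "call.3", "call.3"]

fine_score = accuracy(gold_tags, system_tags)
coarse_score = coarse_accuracy(gold_tags, system_tags, SENSE_GROUPS)
print(f"fine-grained: {fine_score:.2f}, coarse-grained: {coarse_score:.2f}")
```

In this toy data the first instance is wrong at the fine-grained level but right at the group level, so the coarse-grained score is higher. The same functions could be applied to a second annotator's tags in place of system output to compare human agreement under both granularities.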
