Word-Sense Distinguishability and Inter-Coder Agreement

It is common in NLP that the categories into which text is classified do not have fully objective definitions. Examples of such categories are lexical distinctions such as part-of-speech tags and word-sense distinctions, sentence-level distinctions such as phrase attachment, and discourse-level distinctions such as topic or speech-act categorization. This paper presents an approach to analyzing the agreement among human judges for the purpose of formulating a refined and more reliable set of category designations. We use these techniques to analyze the sense tags assigned by five judges to the noun interest. The initial tag set is taken from Longman's Dictionary of Contemporary English. Through this process of analysis, we automatically identify and assign a revised set of sense tags for the data. The revised tags exhibit high reliability as measured by Cohen's κ. Such techniques are important for formulating and evaluating both human and automated classification systems.
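For readers unfamiliar with the reliability measure named above, the following is a minimal sketch of Cohen's κ for two judges, computed as (observed agreement - chance agreement) / (1 - chance agreement). The function name, the sense labels, and the toy tag sequences are illustrative assumptions, not data or code from the paper. With five judges, one simple option (again an assumption, not necessarily the paper's procedure) is to compute κ for each pair of judges.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two judges who tagged the same items (hypothetical helper)."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("both judges must tag the same non-empty list of items")
    n = len(tags_a)
    # Observed agreement: fraction of items given the same tag by both judges.
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement: product of the two judges' marginal tag proportions,
    # summed over all tags.
    freq_a = Counter(tags_a)
    freq_b = Counter(tags_b)
    expected = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if expected == 1.0:
        # Both judges used a single identical tag throughout; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy sense tags for six occurrences of a noun (invented for illustration).
judge_1 = ["money", "money", "attention", "share", "money", "attention"]
judge_2 = ["money", "money", "attention", "money", "money", "share"]
kappa = cohen_kappa(judge_1, judge_2)
print(round(kappa, 3))
```

In this toy case the judges agree on four of six items, but because both use the "money" tag heavily, a sizable share of that agreement is expected by chance, so κ is well below the raw agreement rate.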
