OntoNotes: Corpus Cleanup of Mistaken Agreement Using Word Sense Disambiguation

Annotated corpora are only useful if their annotations are consistent. Most large-scale annotation efforts take special measures to reconcile inter-annotator disagreement. To date, however, no one has investigated how to automatically identify exemplars on which the annotators agree but are wrong. In this paper, we use OntoNotes, a large-scale corpus of semantic annotations that includes word senses, predicate-argument structure, ontology linking, and coreference. To find mistaken agreements in word sense annotation, we employ word sense disambiguation (WSD) to select a set of suspicious candidates for human evaluation. Experiments examine the performance of WSD from three aspects: precision, cost-effectiveness ratio, and entropy. The results show that WSD is most effective at identifying erroneous annotations for highly ambiguous words, while a baseline is better in other cases. The two methods can be combined to improve the cleanup process. This procedure allows us to find the approximately 2% of erroneous agreements remaining in the OntoNotes corpus. A similar procedure can easily be defined to check other annotated corpora.
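The candidate-selection idea can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function names, the toy keyword predictor, and the sense labels below are all hypothetical, and flagging on disagreement between a predictor and the agreed annotation is an assumed selection rule. The sketch computes the entropy of a word's sense distribution (one common way to quantify ambiguity) and passes to human review those instances where the predictor disagrees with the agreed sense.

```python
import math
from collections import Counter


def sense_entropy(labels):
    """Shannon entropy (in bits) of the sense distribution of one word."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flag_suspicious(instances, predict):
    """Return instances whose agreed annotation differs from the predictor.

    instances: list of (instance_id, context, annotated_sense) tuples.
    predict:   hypothetical callable mapping context -> (sense, confidence).
    The result is sorted so the most confident disagreements come first.
    """
    flagged = []
    for inst_id, context, annotated in instances:
        predicted, confidence = predict(context)
        if predicted != annotated:
            flagged.append((inst_id, annotated, predicted, confidence))
    flagged.sort(key=lambda item: item[3], reverse=True)
    return flagged


def toy_predict(context):
    """Stand-in for a trained WSD classifier (assumption: keyword rule only)."""
    if "river" in context:
        return ("bank.river", 0.9)
    return ("bank.finance", 0.8)


# Toy data: both annotators agreed on "bank.finance" for every instance.
instances = [
    ("i1", "deposit money at the bank", "bank.finance"),
    ("i2", "fishing on the river bank", "bank.finance"),
    ("i3", "the bank approved the loan", "bank.finance"),
]

candidates = flag_suspicious(instances, toy_predict)
ambiguity = sense_entropy([sense for _, _, sense in instances])
```

In a real setting, `toy_predict` would be replaced by a classifier trained on the annotated data, and the flagged list would only be a starting point for human evaluation, not a verdict on which annotations are wrong.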
