A structural approach to the automatic adjudication of word sense disagreements

Abstract

The semantic annotation of texts with senses from a computational lexicon is a complex and often subjective task. The fine granularity of the WordNet sense inventory [Fellbaum, Christiane (ed.). 1998. WordNet: An Electronic Lexical Database. MIT Press], a de facto standard within the research community, is one of the main causes of both low inter-tagger agreement (between 70% and 80%) and the disappointing performance of automated fine-grained disambiguation systems (the state of the art is around 65% in the Senseval-3 English all-words task). To improve the performance of both manual and automated sense taggers, we can either change the sense inventory (e.g. by adopting a new dictionary or clustering WordNet senses) or aim to resolve the disagreements between annotators by dealing with the fineness of sense distinctions. The former approach is not viable in the short term, as wide-coverage resources are not publicly available and no large-scale reliable clustering of WordNet senses has been released to date. The latter approach requires the ability to distinguish between subtle or misleading sense distinctions. In this paper, we propose the use of structural semantic interconnections (a specific kind of lexical chain) for the adjudication of disagreed sense assignments to words in context. The approach exploits the structure of the lexicon to smooth possible divergences between sense annotators and foster coherent choices. We perform a twofold experimental evaluation of the approach, applied to manual annotations from the SemCor corpus and to automatic annotations from the Senseval-3 English all-words competition. Both sets of experiments and results are entirely novel: structural adjudication improves the state-of-the-art performance in all-words disambiguation by 3.3 points (achieving a 68.5% F1-score) and attains around 80% precision and 60% recall in the adjudication of disagreements between human annotators.
