A Large-Scale Pseudoword-Based Evaluation Framework for State-of-the-Art Word Sense Disambiguation

The evaluation of several tasks in lexical semantics is often limited by the lack of large amounts of manual annotations, not only for training purposes, but also for testing purposes. Word Sense Disambiguation (WSD) is a case in point, as hand-labeled datasets are particularly hard and time-consuming to create. Consequently, evaluations tend to be performed on a small scale, which does not allow for in-depth analysis of the factors that determine a system's performance. In this paper we address this issue by means of a realistic simulation of large-scale evaluation for the WSD task. We make two main contributions: first, we put forward two novel approaches to the wide-coverage generation of semantically aware pseudowords (i.e., artificial words capable of modeling real polysemous words); second, we leverage the most suitable type of pseudoword to create large pseudosense-annotated corpora, which enable a large-scale experimental framework for the comparison of state-of-the-art supervised and knowledge-based algorithms. Using this framework, we study the impact of supervision and knowledge on the two major disambiguation paradigms and perform an in-depth analysis of the factors that affect their performance.
