SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses

Most work on word sense disambiguation has assumed that each word usage is best labeled with a single sense. However, contextual ambiguity or fine-grained sense distinctions can allow a usage to have multiple valid sense interpretations. We present a new SemEval task for evaluating Word Sense Induction and Disambiguation systems in a setting where instances may be labeled with multiple senses, weighted by their applicability. Four teams submitted nine systems, which were evaluated in two settings.
