Two graph-based algorithms for state-of-the-art WSD

This paper explores the use of two graph algorithms for unsupervised induction and tagging of nominal word senses based on corpora. Our main contribution is the optimization of the free parameters of those algorithms and its evaluation against publicly available gold standards. We present a thorough evaluation comprising supervised and unsupervised modes, and both lexical-sample and all-words tasks. The results show that, in spite of the information loss inherent to mapping the induced senses to the gold-standard, the optimization of parameters based on a small sample of nouns carries over to all nouns, performing close to supervised systems in the lexical sample task and yielding the second-best WSD systems for the Senseval-3 all-words task.

[1]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[2]  Eneko Agirre,et al.  Exploring feature spaces with svd and unlabeled data for Word Sense Disambiguation , 2005 .

[3]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[4]  Dragomir R. Radev,et al.  LexRank: Graph-based Centrality as Salience in Text Summarization , 2004 .

[5]  George A. Miller,et al.  A Semantic Concordance , 1993, HLT.

[6]  Eneko Agirre,et al.  Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm , 2006 .

[7]  Patrick Pantel,et al.  Discovering word senses from text , 2002, KDD.

[8]  Jean Véronis,et al.  HyperLex: lexical cartography for information retrieval , 2004, Comput. Speech Lang..

[9]  Dragomir R. Radev,et al.  LexRank: Graph-based Lexical Centrality as Salience in Text Summarization , 2004, J. Artif. Intell. Res..

[10]  Adam Kilgarriff,et al.  The Senseval-3 English lexical sample task , 2004, SENSEVAL@ACL.

[11]  Hinrich Schütze,et al.  Automatic Word Sense Discrimination , 1998, Comput. Linguistics.

[12]  Martha Palmer,et al.  The English all-words task , 2004, SENSEVAL@ACL.

[13]  Paola Velardi,et al.  Structural semantic interconnections: a knowledge-based approach to word sense disambiguation , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Rada Mihalcea,et al.  Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling , 2005, HLT.

[15]  Cheng Niu,et al.  Word Independent Context Pair Classification Model for Word Sense Disambiguation , 2005, CoNLL.

[16]  Ted Pedersen,et al.  Word Sense Discrimination by Clustering Contexts in Vector and Similarity Spaces , 2004, CoNLL.

[17]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[18]  George Karypis,et al.  Hierarchical Clustering Algorithms for Document Datasets , 2005, Data Mining and Knowledge Discovery.