Improving Supervised Sense Disambiguation with Web-Scale Selectors

This paper introduces a method for improving supervised word sense disambiguation by adding a new class of features that leverage contextual information from large unannotated corpora. This feature class, selectors, consists of words that appear in other corpora in the same local context as a given lexical instance. We show that support vector sense classifiers trained with selectors achieve higher accuracy than those trained only with standard features, producing error reductions of 15.4% and 6.9% on standard coarse-grained and fine-grained disambiguation tasks, respectively. Furthermore, we find an error reduction of 9.3% when including selectors in the classification step of named-entity recognition over a representative sample of OntoNotes. These significant improvements come free of any human annotation cost, requiring only unlabeled web-scale corpora.
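The idea of a selector can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `find_selectors`, the fixed two-word context window, and the toy corpus are all assumptions made for illustration, and a real system would query web-scale n-gram data rather than scan a list of sentences. The sketch collects the words that fill the target word's slot wherever the same left and right context appears in unlabeled text.

```python
from collections import Counter


def find_selectors(tokens, target_index, corpus_sentences, window=2):
    """Count words that occupy the target's slot in the same local context.

    Hypothetical helper: 'same local context' is approximated here as an
    exact match of `window` tokens on each side of the target.
    """
    target = tokens[target_index]
    left = tuple(tokens[max(0, target_index - window):target_index])
    right = tuple(tokens[target_index + 1:target_index + 1 + window])

    counts = Counter()
    for sentence in corpus_sentences:
        for i, word in enumerate(sentence):
            if i < len(left):
                continue  # not enough tokens to the left to match
            same_left = tuple(sentence[i - len(left):i]) == left
            same_right = tuple(sentence[i + 1:i + 1 + len(right)]) == right
            # Skip the target word itself; only other slot fillers are selectors.
            if same_left and same_right and word != target:
                counts[word] += 1
    return counts


# Toy unlabeled corpus (illustrative data, not from the paper).
corpus = [
    "they walked on the shore of the lake".split(),
    "she stood on the edge of the cliff".split(),
    "we camped on the shore of the bay".split(),
    "he kept money in the bank of the town".split(),
]

instance = "he sat on the bank of the river".split()
selectors = find_selectors(instance, instance.index("bank"), corpus)

# Selector counts could then be appended to a standard feature vector
# before training a classifier such as an SVM.
features = {f"selector={word}": count for word, count in selectors.items()}
```

In this toy setting the selectors for "bank" are words such as "shore" and "edge", which point toward the riverside sense without any sense-annotated data.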
