Word sense disambiguation with pattern learning and automatic feature selection

This paper presents a novel approach to word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet), and from a generated corpus (GenCor); and (2) instance-based learning with automatic feature selection, applied when training data is available for a particular word. The ideas described in this paper were implemented in a system that achieves excellent performance on the data provided during the SENSEVAL-2 evaluation exercise, for both the English all-words and the English lexical sample tasks.
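As a rough illustration of the second component, the sketch below pairs a nearest-neighbour (instance-based) classifier with greedy forward feature selection scored by leave-one-out accuracy. It is a minimal sketch under stated assumptions, not the system described in the paper: the overlap similarity, the greedy search, the toy instances, and the feature names `prev`, `next`, and `noise` are all hypothetical choices made for the example.

```python
from collections import Counter

def overlap(a, b, feats):
    """Count the selected features on which two instances agree."""
    return sum(1 for f in feats if a[f] == b[f])

def classify(train, x, feats, k=1):
    """Label x by majority vote among its k most similar training instances."""
    ranked = sorted(train, key=lambda ex: overlap(ex[0], x, feats), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, feats):
    """Leave-one-out accuracy of the classifier restricted to feats."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += classify(rest, x, feats) == y
    return hits / len(data)

def forward_select(data, all_feats):
    """Greedy forward selection: keep adding the feature that most improves
    leave-one-out accuracy, and stop when no candidate improves it."""
    chosen, best = [], 0.0
    while True:
        scored = [(loo_accuracy(data, chosen + [f]), f)
                  for f in all_feats if f not in chosen]
        if not scored:
            break
        score, feat = max(scored)
        if score <= best:
            break
        chosen.append(feat)
        best = score
    return chosen, best

# Hypothetical sense-tagged instances of the word "bank".
data = [
    ({"prev": "river", "next": "was", "noise": "a"}, "shore"),
    ({"prev": "river", "next": "is", "noise": "b"}, "shore"),
    ({"prev": "savings", "next": "was", "noise": "a"}, "finance"),
    ({"prev": "savings", "next": "is", "noise": "b"}, "finance"),
]

selected, accuracy = forward_select(data, ["prev", "next", "noise"])
```

On this toy data only the `prev` feature separates the two senses, so the selection keeps it and discards the other two. Selecting features separately for each target word is what lets such a learner ignore context cues that are irrelevant for that word.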
