Paper information - Word sense disambiguation with pattern learning and automatic feature selection

Word sense disambiguation with pattern learning and automatic feature selection

This paper presents a novel approach for word sense disambiguation. The underlying algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor), from dictionary definitions (WordNet) and from a generated corpus (GenCor); and (2) instance based learning with automatic feature selection, when training data is available for a particular word. The ideas described in this paper were implemented in a system that achieves excellent performance on the data provided during the SENSEVAL-2 evaluation exercise, for both English all words and English lexical sample tasks.
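The abstract does not spell out how the second component works, so the following is only a generic sketch of instance-based learning combined with automatic feature selection, not the paper's actual system. It assumes a nearest-neighbor classifier with a simple overlap distance and a greedy forward selection scored by leave-one-out accuracy. All names (`overlap_distance`, `predict`, `loo_accuracy`, `select_features`), the feature names, and the toy data for the word "bank" are hypothetical.

```python
from collections import Counter

def overlap_distance(a, b, features):
    """Count the selected features on which two instances disagree."""
    return sum(1 for f in features if a[f] != b[f])

def predict(train, query, features, k=1):
    """Label a query by majority vote among its k nearest training instances."""
    ranked = sorted(train, key=lambda ex: overlap_distance(ex[0], query, features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, features):
    """Leave-one-out accuracy of the classifier restricted to `features`."""
    correct = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if predict(rest, x, features) == y:
            correct += 1
    return correct / len(data)

def select_features(data, candidates):
    """Greedy forward selection: add the feature that most improves
    leave-one-out accuracy, and stop when no feature improves it."""
    selected, best = [], 0.0
    improved = True
    while improved:
        improved = False
        best_feature = None
        for f in candidates:
            if f in selected:
                continue
            score = loo_accuracy(data, selected + [f])
            if score > best:
                best, best_feature, improved = score, f, True
        if improved:
            selected.append(best_feature)
    return selected, best

# Hypothetical sense-tagged instances of the word "bank" (illustration only).
data = [
    ({"prev": "river", "next": "was", "doc_id": "a"}, "shore"),
    ({"prev": "river", "next": "is", "doc_id": "b"}, "shore"),
    ({"prev": "savings", "next": "was", "doc_id": "c"}, "finance"),
    ({"prev": "savings", "next": "is", "doc_id": "d"}, "finance"),
]

selected, accuracy = select_features(data, ["prev", "next", "doc_id"])
query = {"prev": "river", "next": "is", "doc_id": "z"}
print(selected, accuracy, predict(data, query, selected))
```

Because selection runs on the training instances of a single target word, each word can end up with its own feature subset, which is one plausible reading of "when training data is available for a particular word".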

Rada Mihalcea
