Paper Information - Word Sense Disambiguation with Semi-Supervised Learning

Word Sense Disambiguation with Semi-Supervised Learning

Current word sense disambiguation (WSD) systems based on supervised learning are still limited in that they do not work well for all words in a language. One of the main reasons is the lack of sufficient training data. In this paper, we investigate the use of unlabeled training data for WSD, in the framework of semi-supervised learning. Four semi-supervised learning algorithms are evaluated on 29 nouns of the Senseval-2 (SE2) English lexical sample task and the SE2 English all-words task. Empirical results show that unlabeled data can bring significant improvement in WSD accuracy.

Hwee Tou Ng | Wee Sun Lee | Thanh Phong Pham
