Word Sense Disambiguation with Semi-Supervised Learning

Current word sense disambiguation (WSD) systems based on supervised learning are still limited in that they do not work well for all words in a language. One of the main reasons is the lack of sufficient training data. In this paper, we investigate the use of unlabeled training data for WSD, in the framework of semi-supervised learning. Four semisupervised leaming algorithms are evaluated on 29 nouns of Senseval-2 (SE2) English lexical sample task and SE2 English all-words task. Empirical results show that unlabeled data can bring significant improvement in WSD accuracy.