Pattern Learning and Active Feature Selection for Word Sense Disambiguation

We present the main ideas of the algorithm employed in the SMUls and SMUaw systems. These systems participated in the Senseval-2 competition, where they attained the best performance on both the English all-words and the English lexical sample tasks. The algorithm has two main components: (1) pattern learning from available sense-tagged corpora (SemCor) and dictionary definitions (WordNet), and (2) instance-based learning with active feature selection, used when training data is available for a particular word.
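
The two-component structure described above can be illustrated with a minimal sketch. It is not the SMUls/SMUaw implementation. It assumes, purely for illustration, that instance-based learning is a 1-nearest-neighbour classifier over symbolic context features, and that active feature selection can be approximated by greedy forward selection scored with leave-one-out accuracy. All names (`select_features`, `disambiguate`, `pattern_fallback`, the `prev`/`next` features) are hypothetical, and the pattern-learning component is represented only by a caller-supplied fallback function.

```python
# Illustrative sketch only: per-word instance-based learner with greedy
# forward feature selection, plus a fallback when a word has no training data.

def overlap(a, b, features):
    """Count the selected features on which two contexts agree."""
    return sum(1 for f in features if a.get(f) == b.get(f))

def predict(train, query, features):
    """1-nearest-neighbour prediction; ties go to the earliest example."""
    nearest = max(train, key=lambda ex: overlap(ex[0], query, features))
    return nearest[1]

def loo_accuracy(train, features):
    """Leave-one-out accuracy of the classifier on the training examples."""
    if len(train) < 2:
        return 0.0
    hits = 0
    for i, (context, sense) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        if predict(rest, context, features) == sense:
            hits += 1
    return hits / len(train)

def select_features(train, candidates):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected = []
    best = loo_accuracy(train, selected)
    improved = True
    while improved:
        improved = False
        best_feature = None
        for f in candidates:
            if f in selected:
                continue
            score = loo_accuracy(train, selected + [f])
            if score > best:
                best, best_feature, improved = score, f, True
        if improved:
            selected.append(best_feature)
    return selected, best

def disambiguate(word, context, training_by_word, pattern_fallback, candidates):
    """Use the per-word learner when training data exists, else the fallback."""
    train = training_by_word.get(word)
    if train:
        features, _ = select_features(train, candidates)
        return predict(train, context, features)
    # Hypothetical stand-in for the pattern-learning component.
    return pattern_fallback(word, context)

# Toy data: made-up contexts and sense labels for the word "bank".
toy_train = [
    ({"prev": "river", "next": "was"}, "shore"),
    ({"prev": "river", "next": "is"}, "shore"),
    ({"prev": "central", "next": "was"}, "finance"),
    ({"prev": "central", "next": "is"}, "finance"),
]
toy_candidates = ["next", "prev"]
```

On this toy data only the `prev` feature separates the two senses, so the selection step keeps `prev` and discards `next`. A word with no entry in `training_by_word` is routed to the fallback, mirroring the condition that the instance-based component applies only when training data is available for that word.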