Word Sense Disambiguation Using Vectors of Co-occurrence Information

This paper reports on word sense disambiguation of Korean nouns using co-occurrence information from context. For a given noun, the distribution of words in its local context is not enough to express its semantic characteristics for sense disambiguation. This paper proposes using a cluster-based sense representation as the base vector. Contextual noise is removed by a term weighting method, and hypernyms of the remaining contextual words are used to modify the base vector so as to enhance discrimination. These hypernyms are extracted from dictionary definition patterns, with some loss of precision. When sense disambiguation fails, the most dominant sense in the training data set is used. Experiments on the Korean SENSEVAL test suite show that our method yields up to a 42% improvement in precision.
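The pipeline outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The IDF-style weighting, the cosine similarity, the rule that a hypernym inherits the weight of its source word, and the toy English data (`EXAMPLES`, `HYPERNYMS`) are all assumptions made for illustration. The abstract does not specify the weighting scheme or the similarity measure, and the clustering step that produces the sense vectors is reduced here to summing the labeled training contexts of each sense.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the paper works on Korean nouns and extracts
# hypernyms from dictionary definitions.
HYPERNYMS = {"river": "water", "stream": "water", "loan": "money", "deposit": "money"}
EXAMPLES = [
    ("bank/finance", ["loan", "the", "interest"]),
    ("bank/finance", ["deposit", "the"]),
    ("bank/river", ["river", "the", "fishing"]),
]

def train(examples, hypernyms):
    """Build one weighted vector per sense and pick the fallback sense."""
    counts = defaultdict(Counter)
    sense_freq = Counter()
    for sense, words in examples:
        sense_freq[sense] += 1
        counts[sense].update(words)
    n_senses = len(counts)
    # In how many senses' contexts does each term occur?
    spread = Counter(term for vec in counts.values() for term in vec)
    vectors = {}
    for sense, vec in counts.items():
        base = {}
        for term, count in vec.items():
            # Assumed IDF-style weight: terms seen with every sense get 0.
            weight = count * math.log(n_senses / spread[term])
            if weight > 0:  # drop zero-weight terms as contextual noise
                base[term] = weight
        # Add hypernyms of the remaining words to sharpen the vector.
        for term, weight in list(base.items()):
            hyper = hypernyms.get(term)
            if hyper is not None:
                base[hyper] = base.get(hyper, 0.0) + weight
        vectors[sense] = base
    default_sense = sense_freq.most_common(1)[0][0]
    return vectors, default_sense

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def disambiguate(context, vectors, default_sense, hypernyms):
    """Return the best-matching sense, or the dominant sense if nothing matches."""
    query = Counter(context)
    for word in context:
        if word in hypernyms:
            query[hypernyms[word]] += 1
    scores = {sense: cosine(query, vec) for sense, vec in vectors.items()}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0.0:
        return default_sense  # disambiguation failed: use the dominant sense
    return best
```

In this toy setting, a context containing "stream" shares no surface word with the training data, yet it can still match the river sense through the shared hypernym "water". A context with no informative overlap falls back to the most frequent training sense.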