Full-words Automatic Word Sense Tagging Based on Unsupervised Learning Algorithm

To implement automatic Chinese word sense tagging, this paper presents a new word sense disambiguation method based on unsupervised machine learning strategies. Four word sense disambiguation models are built and compared. The model that combines two unsupervised machine learning strategies and selects contextual features using dependency grammar achieves the best performance. It can be trained on a large-scale corpus to mitigate the problem of data sparseness. In addition, it is accurate, fast, and easy to extend. The technique is therefore suitable for word sense tagging of large-scale real-world text.