Correlation Voting Fusion Strategy for Part of Speech Tagging

Having studied four corpus-based approaches to part of speech (POS) tagging, namely transformation-based error-driven learning, the decision tree, the hidden Markov model and maximum entropy, we present in this paper a novel data fusion strategy for POS tagging, correlation voting. Theoretical analysis and comparative experiments with other fusion strategies show that applying data fusion describes the linguistic knowledge needed for POS tagging more completely and yields better tagging results. Correlation voting outperforms the other fusion methods, with a decrease of 27.85% in average tagging error rate.
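To make the idea of fusing tagger outputs concrete, the following is a minimal sketch of plain plurality voting over the outputs of several taggers. It is a generic baseline of the kind a fusion strategy might be compared against, and it is not the correlation voting method proposed in this paper. The function name `majority_vote`, the `fallback_index` tie-breaking rule and the sample tag sequences are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(tags_per_tagger, fallback_index=0):
    """Fuse tag sequences from several taggers by plurality vote.

    tags_per_tagger: list of equal-length tag sequences, one per tagger.
    fallback_index: hypothetical tie-breaking rule; on a tie, the tag of
    this tagger is used if it is among the tied tags.
    """
    fused = []
    # zip(*...) groups the tags proposed for the same token position.
    for token_tags in zip(*tags_per_tagger):
        counts = Counter(token_tags)
        best_count = max(counts.values())
        # Tied tags, in order of first appearance among the taggers.
        tied = [tag for tag, count in counts.items() if count == best_count]
        fallback_tag = token_tags[fallback_index]
        fused.append(fallback_tag if fallback_tag in tied else tied[0])
    return fused


# Illustrative (made-up) outputs of four taggers for a three-token sentence.
tbl_tags = ["DT", "NN", "VBZ"]   # transformation-based tagger
tree_tags = ["DT", "NN", "NNS"]  # decision tree tagger
hmm_tags = ["DT", "VB", "VBZ"]   # hidden Markov model tagger
me_tags = ["DT", "NN", "VBZ"]    # maximum entropy tagger

fused_tags = majority_vote([tbl_tags, tree_tags, hmm_tags, me_tags])
print(fused_tags)
```

In this sketch every tagger has an equal say, so a single tagger's error is outvoted whenever the other three agree. A weighted or correlation-aware scheme would replace the simple count with scores that depend on which taggers agree.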