A New Method of the Automatically Marked Chinese Part of Speech Based on Gaussian Prior Smoothing Maximum Entropy Model

With its many virtues, the maximum entropy (ME) model has been widely favored in natural language processing. Because training data are limited, parameter sparsity is a serious problem in Chinese part-of-speech tagging, and the model is prone to overfitting the training data; some smoothing method should therefore be applied to the maximum entropy model. Several smoothing methods for maximum entropy models have been proposed to address this problem, and among them Gaussian prior smoothing performs outstandingly. Based on this smoothed maximum entropy model and the characteristics of Chinese, a new Chinese part-of-speech tagging system is presented. Experimental results show that it works well.
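
The abstract does not state the training objective. As a sketch only, the commonly used form of Gaussian prior smoothing for a conditional maximum entropy model can be written as follows. All symbols here are assumptions introduced for illustration and are not taken from the paper: $x_j$ is the context of the $j$-th training token, $y_j$ its tag, $f_i$ a binary feature function, $\lambda_i$ its weight, and $\sigma_i^2$ the variance of a zero-mean Gaussian prior on $\lambda_i$.

```latex
% Conditional maximum entropy model (standard form; notation assumed)
p_\lambda(y \mid x) = \frac{\exp\big(\sum_i \lambda_i f_i(x, y)\big)}
                           {\sum_{y'} \exp\big(\sum_i \lambda_i f_i(x, y')\big)}

% Log-likelihood penalized by a zero-mean Gaussian prior on each weight
L(\lambda) = \sum_{j} \log p_\lambda(y_j \mid x_j)
             - \sum_i \frac{\lambda_i^2}{2\sigma_i^2}

% Gradient: observed feature count minus expected count minus the prior term
\frac{\partial L}{\partial \lambda_i}
  = \sum_j f_i(x_j, y_j)
    - \sum_j \sum_{y} p_\lambda(y \mid x_j)\, f_i(x_j, y)
    - \frac{\lambda_i}{\sigma_i^2}
```

Under this formulation the prior term $\lambda_i / \sigma_i^2$ pulls the weights of rarely observed features toward zero, which is how the smoothing would counter the sparsity and overfitting described above. A smaller $\sigma_i^2$ means stronger smoothing.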