Chinese Chunking based on Conditional Random Fields

In this paper, we propose an approach to Chinese chunking based on Conditional Random Fields (CRFs). For sequence labeling, CRFs have advantages over generative models, and Chinese chunking is a difficult sequence labeling task. We describe how to apply CRFs to Chinese chunking by capturing arbitrary, overlapping features. We define several types of features for the model and study their effects on the UPENN Chinese TreeBank-4 (CTB4) data set. For comparison, we also apply other models to the task on the same data set. The experimental results show that the proposed approach achieves better performance than the other models.
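
To illustrate what arbitrary, overlapping features can look like for a chunker, the following is a minimal Python sketch. It is not the feature set used in this paper: the template names (`w[0]`, `p[-1]|p[0]`, and so on), the window size, the `<PAD>` symbol, and the example sentence are all assumptions made for illustration. Each token is assumed to be a (word, POS tag) pair, and the same observation (for example the current POS tag) deliberately appears in several features at once.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, POS tag); an assumed input format

PAD: Token = ("<PAD>", "<PAD>")  # hypothetical padding symbol for positions outside the sentence


def token_features(sent: List[Token], i: int, window: int = 2) -> Dict[str, str]:
    """Build overlapping word/POS features for position i (illustrative templates only)."""

    def at(j: int) -> Token:
        # Return the token at index j, or the padding token if j is out of range.
        return sent[j] if 0 <= j < len(sent) else PAD

    feats: Dict[str, str] = {"bias": "1"}

    # Unigram features over a context window around position i.
    for off in range(-window, window + 1):
        word, pos = at(i + off)
        feats[f"w[{off}]"] = word
        feats[f"p[{off}]"] = pos

    # Conjunction features: these overlap with the unigram features above,
    # which a conditional model such as a CRF can accommodate.
    feats["p[-1]|p[0]"] = f"{at(i - 1)[1]}|{at(i)[1]}"
    feats["p[0]|p[1]"] = f"{at(i)[1]}|{at(i + 1)[1]}"
    feats["w[0]|p[0]"] = f"{at(i)[0]}|{at(i)[1]}"
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """One feature dictionary per token, in sentence order."""
    return [token_features(sent, i) for i in range(len(sent))]


# Hypothetical three-token sentence with POS tags.
sent = [("他", "PN"), ("喜欢", "VV"), ("北京", "NR")]
features = sentence_features(sent)
```

A sequence of per-token feature dictionaries like `features`, paired with one chunk label per token, is a common input shape for CRF toolkits; the choice of templates is where the effect of different feature types would be studied.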