Chinese Functional Chunk Parsing Employing CRF and Semantic Information

We present a system for automatically labeling Chinese functional chunks: given a sentence with correct word segmentation and POS tagging, it detects the boundaries of the functional chunks and assigns each chunk its functional label. The approach combines a feature template optimization strategy with a Conditional Random Field (CRF) model. On the test set, the precision, recall and F1 measure for Chinese functional chunks reach 85.84%, 85.07% and 85.45%, respectively. Building on this, an existing language resource, the Chinese thesaurus "Tongyici Cilin", is introduced into the processing module, and the semantic information it provides is added to the feature template to mitigate data sparseness and ambiguity. With this addition, the three measures rise to 86.21%, 85.31% and 85.76%, respectively.
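As an illustration of how thesaurus-derived semantic information might enter a CRF feature template, the following is a minimal sketch and not the feature set used in this work. The function name `token_features`, the toy lookup table `TOY_CILIN`, its class codes, the window size, and the back-off to the POS tag for words missing from the thesaurus are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical word -> semantic class lookup. The codes are placeholders,
# not real "Tongyici Cilin" entries.
TOY_CILIN: Dict[str, str] = {"苹果": "C1", "香蕉": "C1", "吃": "C2"}

PAD = "<PAD>"


def token_features(
    sent: List[Tuple[str, str]],
    i: int,
    cilin: Dict[str, str],
    window: int = 1,
) -> Dict[str, str]:
    """Build word, POS and semantic-class features for token i.

    `sent` is a list of (word, pos) pairs from a segmented, POS-tagged
    sentence. Features are taken from a symmetric context window.
    """
    feats: Dict[str, str] = {}
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(sent):
            word, pos = sent[j]
            # Assumed back-off: use the POS tag when the word has no class.
            sem = cilin.get(word, pos)
        else:
            word, pos, sem = PAD, PAD, PAD
        feats[f"w[{off}]"] = word
        feats[f"pos[{off}]"] = pos
        feats[f"sem[{off}]"] = sem
    return feats
```

In this sketch, two sentences that differ only in a word drawn from the same semantic class share the same `sem[...]` feature value, which is one way a class-level feature can reduce the sparseness of purely lexical features. The resulting dictionaries have the per-token form accepted by common CRF toolkits.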