Chinese NP Chunking: A Semi-Supervised Approach

In Chinese, both V N and N V sequences may form noun phrases, which makes NP chunking particularly difficult. We present a method that addresses this problem by combining labelled data from the Chinese Sinica Treebank with unlabelled data to train a better SVM-based model. Experiments on open test data show that the proposed semi-supervised approach achieves an f-measure of 78.79%, an improvement of 8.79% over the supervised approach.
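
For readers unfamiliar with the metric, the following is a minimal sketch of how a chunk-level f-measure could be computed. It assumes NP chunks are encoded with B/I/O tags and that a predicted chunk counts as correct only when both boundaries match the gold chunk exactly. The tag scheme and the helper names `bio_to_spans` and `f_measure` are illustrative assumptions and are not taken from the evaluation setup of this paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect NP chunk spans from a B/I/O tag sequence (assumed encoding)."""
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        if start is not None and tag != "I":
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
        elif tag == "I" and start is None:
            start = i  # tolerate a chunk that opens with I
    return spans


def f_measure(gold: Set[Span], pred: Set[Span]) -> float:
    """Balanced f-measure over exactly matching chunk spans."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: one of two predicted chunks matches one of two gold chunks.
gold_spans = bio_to_spans(["B", "I", "O", "B", "O"])
pred_spans = bio_to_spans(["B", "I", "O", "O", "B"])
score = f_measure(gold_spans, pred_spans)
```

In the toy example, precision and recall are both 1/2, so the f-measure is also 1/2. Exact-boundary matching is a strict criterion: a chunk that is off by a single token earns no credit.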