Chinese Chunk Identification Using SVMs Plus Sigmoid

This paper presents a method for Chinese chunk recognition based on Support Vector Machines (SVMs) combined with sigmoid functions. SVMs are binary classifiers that achieve strong performance in many tasks. However, applying binary classifiers directly to Chinese chunking leads to two dilemmas: a single unlabeled constituent may be assigned two or more different class labels, or it may be assigned no class label at all. Fitting a sigmoid function to the SVM outputs yields estimates of the posterior probability P(class | input), which is useful for post-processing the classification results. These probabilities are then used to resolve both dilemmas. We compare our SVMs-plus-sigmoid method with methods based on SVMs alone. The experiments show that the sigmoid step yields significant improvements.
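The resolution step described above can be illustrated with a minimal sketch. It assumes one binary SVM per chunk label and a sigmoid of the form P = 1 / (1 + exp(A·f + B)) applied to each SVM's decision value f; the abstract does not specify the exact parameterization. The function names, the label set, the decision values, and the (A, B) parameters are all hypothetical and chosen only for illustration. In practice A and B would be fitted on held-out data.

```python
import math

def sigmoid_probability(f, a, b):
    """Map an SVM decision value f to an estimate of P(class | input).

    Assumed form: 1 / (1 + exp(a * f + b)); a and b are fitted per classifier.
    """
    return 1.0 / (1.0 + math.exp(a * f + b))

def resolve_label(decision_values, sigmoid_params):
    """Pick the label with the highest estimated probability.

    decision_values: label -> raw output of that label's binary SVM
    sigmoid_params:  label -> (a, b) sigmoid parameters for that SVM
    """
    probs = {
        label: sigmoid_probability(f, *sigmoid_params[label])
        for label, f in decision_values.items()
    }
    best = max(probs, key=probs.get)
    return best, probs

# Hypothetical sigmoid parameters for three illustrative chunk tags.
params = {"B-NP": (-2.0, 0.0), "I-NP": (-1.5, 0.1), "O": (-1.0, 0.0)}

# Dilemma 1: two binary SVMs fire (positive decision value) for one constituent.
conflict = {"B-NP": 0.8, "I-NP": 0.3, "O": -1.2}

# Dilemma 2: no binary SVM fires (all decision values are negative).
no_label = {"B-NP": -0.4, "I-NP": -0.9, "O": -0.1}

for name, values in (("conflict", conflict), ("no_label", no_label)):
    label, probs = resolve_label(values, params)
    print(name, label, {k: round(v, 3) for k, v in probs.items()})
```

Because the probabilities of different classifiers are on a common scale, taking the maximum gives exactly one label in both cases, whereas the raw signs of the decision values give two labels in the first case and none in the second.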
