Study on Chinese question classification based on SVM multi-category classification

Support vector machines (SVMs) were originally designed for binary classification. How to extend two-class classification to multi-class classification remains a problem that needs further study. This paper gives an overview of representative existing methods for multi-category SVMs and compares their performance. It also discusses the Chinese question classification hierarchy and the selection of question features. The four SVM multi-category classification algorithms are then applied to Chinese question classification and compared in contrast experiments. The results show that the binary-tree algorithm is more effective than the other algorithms for Chinese question classification.
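The abstract does not give the details of the binary-tree algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: the class set is split recursively into two halves, and one binary classifier is trained per split. The binary learner here is a nearest-centroid rule used as a stand-in for an SVM, and the class labels, feature vectors, and all function names are hypothetical illustrations rather than the paper's data or method.

```python
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidBinary:
    """Stand-in binary learner (NOT an SVM): chooses the nearer centroid."""

    def fit(self, left_points, right_points):
        self.left_c = centroid(left_points)
        self.right_c = centroid(right_points)
        return self

    def goes_left(self, x):
        return sq_dist(x, self.left_c) <= sq_dist(x, self.right_c)


def build_tree(classes, by_class):
    """Recursively halve the class list; a leaf is a single class label."""
    if len(classes) == 1:
        return classes[0]
    mid = len(classes) // 2
    left, right = classes[:mid], classes[mid:]
    left_pts = [p for c in left for p in by_class[c]]
    right_pts = [p for c in right for p in by_class[c]]
    clf = CentroidBinary().fit(left_pts, right_pts)
    return (clf, build_tree(left, by_class), build_tree(right, by_class))


def train(samples):
    """samples: list of (feature_vector, label) pairs; labels are strings."""
    by_class = defaultdict(list)
    for x, label in samples:
        by_class[label].append(x)
    return build_tree(sorted(by_class), by_class)


def predict(tree, x):
    """Walk from the root to a leaf, applying one binary decision per level."""
    while isinstance(tree, tuple):
        clf, left, right = tree
        tree = left if clf.goes_left(x) else right
    return tree


def count_classifiers(tree):
    """Number of binary classifiers (internal nodes) in the tree."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_classifiers(tree[1]) + count_classifiers(tree[2])


# Hypothetical toy data: four question classes in a 2-D feature space.
SAMPLES = [
    ([0, 0], "location"), ([0, 1], "location"),
    ([0, 4], "number"), ([0, 5], "number"),
    ([10, 0], "person"), ([10, 1], "person"),
    ([10, 4], "time"), ([10, 5], "time"),
]
tree = train(SAMPLES)
```

With K classes, a tree of this kind trains K - 1 binary classifiers, and a balanced tree needs about log2(K) binary decisions per prediction. How the classes are grouped at each node matters in practice; the alphabetical split used here is only a placeholder.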
