Chinese Word Segmentation with Conditional Support Vector Inspired Markov Models

In this paper, we present our method for the SIGHAN-2010 Chinese word segmentation bake-off. This year, our focus is on fast training and testing on the given data. Rather than use one of the common structural learning algorithms, such as conditional random fields (CRF), we design an in-house conditional support vector Markov model (CMM) framework. The method is very fast to train and also achieves better accuracy than CRF. For a fair comparison, we compare our method to CRF on three additional tasks, including CoNLL-2000 chunking and SIGHAN-3 Chinese word segmentation. The results are encouraging and indicate that the proposed CMM yields not only better accuracy but also shorter training time. The official SIGHAN-2010 results also demonstrate that our method performs very well on traditional Chinese with a fine-tuned feature set.
