Two Knives Cut Better Than One: Chinese Word Segmentation with Dual Decomposition

There are two dominant approaches to Chinese word segmentation: word-based and character-based models, each with its own strengths. Prior work has shown that combining these two types of models improves segmentation performance; however, past efforts have not provided a technique practical enough for mainstream adoption. We propose a method that effectively combines the strengths of both segmentation schemes using an efficient dual-decomposition algorithm for joint inference. Our method is simple and easy to implement. Experiments on the SIGHAN 2003 and 2005 evaluation datasets show that our method achieves the best reported results to date on 6 out of 7 datasets.
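To make the idea of joint inference by dual decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes both models are reduced to toy scorers over the same boundary tags (1 = character begins a word, 0 = character continues a word): a character tagger decoded with Viterbi and a word-based scorer decoded with a semi-Markov dynamic program. All names (`char_decode`, `word_decode`, `dual_decompose`), the score tables, the `max_len` limit, and the `step / t` schedule are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

NEG_INF = float("-inf")


def char_decode(emit: List[List[float]], trans: List[List[float]],
                u: List[float]) -> List[int]:
    """Viterbi over tags {0, 1}; the multiplier u[i] is added when tag i is 1."""
    n = len(emit)
    best = [[NEG_INF, NEG_INF] for _ in range(n)]
    back = [[0, 0] for _ in range(n)]
    best[0][1] = emit[0][1] + u[0]  # the first character always begins a word
    for i in range(1, n):
        for b in (0, 1):
            for a in (0, 1):
                s = best[i - 1][a] + trans[a][b] + emit[i][b] + (u[i] if b else 0.0)
                if s > best[i][b]:
                    best[i][b], back[i][b] = s, a
    tag = 0 if best[n - 1][0] > best[n - 1][1] else 1
    tags = [0] * n
    for i in range(n - 1, -1, -1):  # follow back-pointers from the last position
        tags[i] = tag
        tag = back[i][tag]
    return tags


def word_decode(chars: List[str], word_score: Callable[[str], float],
                u: List[float], max_len: int = 4) -> List[int]:
    """Semi-Markov DP over words; u[i] is subtracted when a word begins at i."""
    n = len(chars)
    best = [NEG_INF] * (n + 1)
    best[0] = 0.0
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            s = best[i] + word_score("".join(chars[i:j])) - u[i]
            if s > best[j]:
                best[j], back[j] = s, i
    tags = [0] * n
    j = n
    while j > 0:  # recover the word start positions
        tags[back[j]] = 1
        j = back[j]
    return tags


def dual_decompose(chars: List[str], emit: List[List[float]],
                   trans: List[List[float]], word_score: Callable[[str], float],
                   iters: int = 50, step: float = 1.0) -> Tuple[List[int], bool]:
    """Subgradient descent on the dual; stops as soon as both models agree."""
    u = [0.0] * len(chars)
    y: List[int] = []
    for t in range(1, iters + 1):
        y = char_decode(emit, trans, u)
        z = word_decode(chars, word_score, u)
        if y == z:
            return y, True  # agreement: y is optimal for the combined objective
        rate = step / t     # assumed decaying step size
        u = [ui - rate * (yi - zi) for ui, yi, zi in zip(u, y, z)]
    return y, False         # no agreement: fall back to the character model


# Toy usage with placeholder characters and made-up scores.
chars = list("ABCD")
emit = [[0.0, 0.5] for _ in chars]   # character model mildly prefers a boundary
trans = [[0.0, 0.0], [0.0, 0.0]]
lexicon = {"AB": 2.0, "CD": 2.0}     # word model prefers two known words
tags, agreed = dual_decompose(chars, emit, trans, lambda w: lexicon.get(w, -1.0))
```

In this toy setting the two decoders disagree at first, and the multipliers `u` then push each model toward the other's boundaries until they return the same tag sequence. When the decoders agree, the shared output is an exact maximizer of the sum of the two scores; when the iteration limit is reached without agreement, the sketch simply returns the character model's output, which is one possible fallback among several.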
