Switch-LSTMs for Multi-Criteria Chinese Word Segmentation

Multi-criteria Chinese word segmentation (CWS) is a promising but challenging task that exploits several different segmentation criteria and mines their common underlying knowledge. In this paper, we propose a flexible multi-criteria learning method for CWS. A segmentation criterion can usually be decomposed into multiple sub-criteria, which are shareable with other segmentation criteria, so the process of word segmentation can be viewed as routing among these sub-criteria. From this perspective, we present Switch-LSTMs to segment words, which consist of several long short-term memory (LSTM) networks and a switcher that automatically switches the routing among these LSTMs. With these auto-switched LSTMs, our model provides a more flexible solution for multi-criteria CWS and makes it easy to transfer the learned knowledge to new criteria. Experiments show that our model obtains significant improvements on eight corpora with heterogeneous segmentation criteria, compared to the previous method and single-criterion learning.
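
The routing idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify the switcher's inputs or how the LSTM outputs are combined, so the sketch assumes a soft (softmax) switch computed from the current character embedding, the previous hidden state, and a learned criterion embedding, and it mixes the candidate states of all cells by the switch weights. All names (`SwitchLSTM`, `LSTMCell`, `criterion_emb`, `W_switch`) and all dimensions are hypothetical, weights are random rather than trained, and the tag classifier on top of the hidden states is omitted.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

class LSTMCell:
    """A plain LSTM cell (biases omitted); each cell stands in for one sub-criterion."""
    def __init__(self, input_dim, hidden_dim, rng):
        # One weight matrix per gate: input, forget, output, candidate.
        self.W = [rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for _ in range(4)]

    def step(self, x, h, c):
        z = x + h  # concatenate input and previous hidden state
        i = [sigmoid(a) for a in matvec(self.W[0], z)]
        f = [sigmoid(a) for a in matvec(self.W[1], z)]
        o = [sigmoid(a) for a in matvec(self.W[2], z)]
        g = [math.tanh(a) for a in matvec(self.W[3], z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

class SwitchLSTM:
    """Several LSTM cells plus a switcher that routes among them (assumed soft routing)."""
    def __init__(self, input_dim, hidden_dim, num_cells, num_criteria, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.cells = [LSTMCell(input_dim, hidden_dim, rng) for _ in range(num_cells)]
        # Assumption: one embedding per segmentation criterion (corpus).
        self.criterion_emb = rand_matrix(num_criteria, input_dim, rng)
        # Assumption: the switcher scores each cell from [x; h_prev; criterion embedding].
        self.W_switch = rand_matrix(num_cells, 2 * input_dim + hidden_dim, rng)

    def forward(self, inputs, criterion_id):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        task = self.criterion_emb[criterion_id]
        outputs, routes = [], []
        for x in inputs:
            # Switch weights over the cells for this character.
            weights = softmax(matvec(self.W_switch, x + h + task))
            candidates = [cell.step(x, h, c) for cell in self.cells]
            # Mix the candidate hidden and memory states by the switch weights.
            h = [sum(w * hk[j] for w, (hk, _) in zip(weights, candidates))
                 for j in range(self.hidden_dim)]
            c = [sum(w * ck[j] for w, (_, ck) in zip(weights, candidates))
                 for j in range(self.hidden_dim)]
            outputs.append(h)
            routes.append(weights)
        return outputs, routes

# Usage: three toy character embeddings, segmented under criterion 1 of 8.
model = SwitchLSTM(input_dim=4, hidden_dim=3, num_cells=2, num_criteria=8)
sentence = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.1], [0.5, 0.0, -0.3, 0.2]]
outputs, routes = model.forward(sentence, criterion_id=1)
```

Under these assumptions, `routes[t]` holds the switch distribution over the LSTM cells at character `t`. The same cells are shared by all criteria, and only the criterion embedding changes the routing, which is one way the transfer to a new criterion described above could be realized.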
