Multi-Criteria Chinese Word Segmentation with Transformer

Differing linguistic perspectives have given rise to many segmentation criteria for Chinese word segmentation (CWS). Most existing methods focus on improving single-criterion CWS. However, it is also interesting to exploit these heterogeneous criteria and mine the underlying knowledge common to them. In this paper, we propose a concise and effective model for multi-criteria CWS that uses a shared fully-connected self-attention model to segment a sentence according to a criterion indicator. Experiments on eight datasets with heterogeneous segmentation criteria show that performance on every corpus improves significantly over single-criterion learning.
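
To make the idea of a criterion indicator concrete, the following is a minimal sketch and not the paper's implementation. It assumes the indicator is a special token prepended to the character sequence and that segmentation is cast as B/M/E/S character tagging; the token name `<crit_a>`, the helper names, and the hand-written tag sequences are all hypothetical. No model is run: the tags stand in for what a shared tagger might output under two different criteria.

```python
from typing import List


def add_criterion_indicator(chars: List[str], criterion: str) -> List[str]:
    """Prepend a (hypothetical) criterion token so one shared model can be
    told which segmentation criterion to follow."""
    return [f"<{criterion}>"] + chars


def tags_to_words(chars: List[str], tags: List[str]) -> List[str]:
    """Decode B/M/E/S character tags into a list of words."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E (end) or S (single)
            words.append(current)
            current = ""
    if current:  # flush an unterminated word, if any
        words.append(current)
    return words


chars = list("林丹赢得总冠军")
model_input = add_criterion_indicator(chars, "crit_a")

# Illustrative, hand-written tag sequences (not model output) showing how
# the same sentence can be segmented differently under two criteria.
tags_a = ["B", "E", "B", "E", "B", "M", "E"]
tags_b = ["S", "S", "B", "E", "S", "B", "E"]

words_a = tags_to_words(chars, tags_a)
words_b = tags_to_words(chars, tags_b)
print(model_input)
print(words_a)
print(words_b)
```

The point of the sketch is only that the input characters are identical in both cases; what changes is the indicator, and with it the tag sequence the shared model is expected to produce.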
