A Deep Attention Network for Chinese Word Segment

Character-level sequence labeling is the most efficient way to handle the unknown-word problem in Chinese word segmentation (CWS). However, the most widely used model, Conditional Random Fields (CRF), requires a large number of manually designed features. It is therefore appropriate to combine CRF with neural networks such as the recurrent neural network (RNN), which has been adopted in many natural language processing (NLP) tasks. RNNs, however, are rather slow because each computation step depends on the previous one, and they are not good at capturing the local information of a sentence. To solve this problem, we introduce into CWS a self-attention mechanism, which computes the interaction between any two positions of the sentence at the same distance, regardless of how far apart they are. We propose a deep neural network that combines convolutional neural networks with the self-attention mechanism. We evaluate the model on the PKU and MSR datasets. The results show that our model performs much better.
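To make the character-level labeling formulation concrete, the following minimal sketch converts a segmented sentence into per-character tags and back. It assumes the common B/M/E/S tag scheme (begin, middle, end, single), which the abstract does not specify; the function names `words_to_tags` and `tags_to_words` and the sample sentence are hypothetical and serve only as illustration. It is not the authors' implementation.

```python
def words_to_tags(words):
    """Map a segmented sentence to one tag per character.

    Assumed scheme (not stated in the abstract): B = begin, M = middle,
    E = end of a multi-character word, S = single-character word.
    """
    tags = []
    for word in words:
        if not word:  # skip empty strings defensively
            continue
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their (predicted) tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word boundary follows this character
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends without E or S
        words.append(current)
    return words


# Hypothetical example sentence, already segmented into words.
segmented = ["我", "喜欢", "自然语言"]
tags = words_to_tags(segmented)
chars = "".join(segmented)
print(list(zip(chars, tags)))
print(tags_to_words(chars, tags))
```

Because the tagger labels individual characters rather than looking words up in a dictionary, a word never seen in training can still be recovered whenever its characters receive a consistent tag sequence.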
