Incorporating Word Attention into Character-Based Word Segmentation

Neural network models have been actively applied to word segmentation, especially for Chinese, because of their ability to minimize the effort required for feature engineering. Typical segmentation models are either character-based, which allows exact inference, or word-based, which allows the use of word-level information. We propose a character-based model that uses word information to combine the advantages of both types of models. Our model uses an attention mechanism to learn the importance of multiple candidate words for each character and uses these learned weights in its segmentation decisions. The experimental results show that our model achieves better performance than the state-of-the-art models on both Japanese and Chinese benchmark datasets.
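
As an illustration of attending over candidate words for a character, the following minimal sketch enumerates the vocabulary words that cover a given character position and weights them with a softmax over dot-product scores. It is not the model described in the paper: the function names (`candidate_words`, `attend`), the dot-product scoring, the toy vocabulary, and the `max_len` limit are all assumptions chosen for brevity.

```python
import math


def candidate_words(sentence, index, vocab, max_len=4):
    """Return (start, end) spans of vocabulary words that cover position `index`.

    Hypothetical helper: `vocab` is any set of known word strings and
    `max_len` caps the candidate word length (both are assumptions).
    """
    spans = []
    for start in range(max(0, index - max_len + 1), index + 1):
        last_end = min(len(sentence), start + max_len)
        for end in range(index + 1, last_end + 1):
            if sentence[start:end] in vocab:
                spans.append((start, end))
    return spans


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(char_vec, word_vecs):
    """Weight candidate word vectors by their dot product with the character vector.

    Returns the attention weights and the weighted sum of the word vectors.
    Dot-product scoring is an assumed choice, not taken from the paper.
    """
    scores = [sum(c * w for c, w in zip(char_vec, vec)) for vec in word_vecs]
    weights = softmax(scores)
    summary = [
        sum(wt * vec[d] for wt, vec in zip(weights, word_vecs))
        for d in range(len(char_vec))
    ]
    return weights, summary


if __name__ == "__main__":
    toy_vocab = {"ab", "b", "bc", "bcd"}  # toy vocabulary, illustration only
    print(candidate_words("abcd", 1, toy_vocab))
    print(attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In a full model, the weighted summary would presumably be combined with the character representation before the per-character tag is predicted; how that combination is done is not specified in the abstract.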
