Chinese Word Embeddings with Subwords

Word embeddings are useful in a variety of natural language processing tasks. Recently, more research has focused on learning word embeddings with morphological knowledge of words, such as character and subword information. In this paper, we present a new method that uses subwords and characters together to enhance word embeddings (SWE). In our model, we use subword and character vectors to modify the direction of word vectors, instead of adding them to the word vectors directly. We evaluate SWE on both the word similarity task and the analogical reasoning task. The results demonstrate that our model learns better Chinese word embeddings than the baseline models.
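The abstract contrasts modifying the *direction* of a word vector with adding subword vectors to it directly. One minimal sketch of that idea (not the paper's exact formulation; the function name, the mixing weight `lam`, and the renormalization step are illustrative assumptions) is to mix in the mean of the subword/character vectors and then rescale back to the word vector's original magnitude, so only its direction changes:

```python
import numpy as np

def direction_modified_embedding(word_vec, sub_vecs, lam=0.5):
    """Hypothetical sketch: steer the direction of a word vector using
    the mean of its subword/character vectors, while preserving the
    word vector's original magnitude (contrast with plain addition,
    which changes both direction and length)."""
    sub_mean = np.mean(sub_vecs, axis=0)
    combined = word_vec + lam * sub_mean
    # Rescale so the result keeps the word vector's norm:
    # only the direction is modified.
    return combined / np.linalg.norm(combined) * np.linalg.norm(word_vec)
```

For example, a unit word vector mixed with an orthogonal subword vector rotates toward it but remains unit length, whereas direct addition would also inflate its norm.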
