Morpheme-Enhanced Spectral Word Embedding

Traditional word embedding models learn only word-level semantic information from a corpus and neglect the valuable semantic information carried by words' internal structures, such as morphemes. To address this problem, this paper exploits morphological information to improve the quality of word embeddings. Building on spectral methods, we propose two word embedding models: Morpheme on Original view and Morpheme on Context view (MOMC), and Morpheme on Context view (MC). In the vector spaces of MOMC and MC, both semantically similar and morphologically similar words are located near one another. In experiments, MOMC, MC, and the baselines are evaluated on word similarity and sentiment classification. The results show that our models outperform all comparative baselines on six word similarity datasets and also rank first on sentiment classification. Using a large German corpus, we further examine the ability of word embeddings to handle morpheme-rich languages via a German word similarity task. MOMC and MC significantly outperform the baselines by more than 5 percentage points on one dataset and by nearly 4 percentage points on the other. These improvements demonstrate the effectiveness of our models in dealing with morpheme-rich languages such as German.
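The MOMC and MC constructions themselves are not reproduced in this abstract, but the general recipe, a spectral (SVD-based) factorization of a word-context matrix whose context view is augmented with morpheme features, can be sketched. Everything below (the function names, the indicator-matrix augmentation, the raw-count weighting) is an illustrative assumption, not the paper's actual algorithm:

```python
# Hypothetical sketch of a morpheme-augmented spectral embedding.
# This is NOT the paper's MOMC/MC implementation; it only illustrates the
# general idea of appending morpheme features to the context view before
# a truncated SVD. Real systems typically also apply PPMI or other
# reweighting to the co-occurrence counts.
import numpy as np

def spectral_embeddings(corpus, morphemes_of, dim=50, window=2):
    """corpus: list of token lists; morphemes_of: token -> list of morphemes."""
    vocab = sorted({w for sent in corpus for w in sent})
    widx = {w: i for i, w in enumerate(vocab)}
    morphs = sorted({m for w in vocab for m in morphemes_of(w)})
    midx = {m: j for j, m in enumerate(morphs)}

    # Word-context co-occurrence counts within a symmetric window.
    C = np.zeros((len(vocab), len(vocab)))
    for sent in corpus:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    C[widx[w], widx[sent[j]]] += 1.0

    # Morpheme view: indicator matrix linking each word to its morphemes.
    M = np.zeros((len(vocab), len(morphs)))
    for w in vocab:
        for m in morphemes_of(w):
            M[widx[w], midx[m]] = 1.0

    # "Morpheme on Context view"-style augmentation (assumed): stack the
    # morpheme features next to the context counts, so morphologically
    # similar words share feature columns.
    X = np.hstack([C, M])

    # Spectral step: truncated SVD; scaled left singular vectors serve
    # as the word embeddings.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    d = min(dim, U.shape[1])
    return vocab, U[:, :d] * S[:d]
```

In such a space, two words sharing a morpheme (for instance, German compounds sharing a stem) acquire overlapping feature columns, so the factorization pulls them together even when their sentential contexts differ, which is consistent with the gains the abstract reports on morpheme-rich German.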
