GGP: Glossary Guided Post-processing for Word Embedding Learning

Word embedding learning is the task of mapping each word to a low-dimensional, continuous vector based on a large corpus. To enhance corpus-based word embedding models, researchers use domain knowledge to learn more distinguishable representations via joint-optimization or post-processing models. However, joint-optimization models require long training time, and existing post-processing models mostly incorporate semantic knowledge, so the resulting embeddings capture little functional information. A glossary is a comprehensive linguistic resource, and in previous work it has usually been exploited to enhance word representations via joint-optimization methods. In this paper, we post-process pre-trained word embedding models by incorporating a glossary, capturing more topical and functional information. We propose GGP (Glossary Guided Post-processing word embedding), a model that consists of a global post-processing function to fine-tune each word vector and an auto-encoding model to learn sense representations, and that constrains each post-processed word representation to be similar to the composition of its sense representations. We evaluate our model against two state-of-the-art models on six word topical/functional similarity datasets; it outperforms these competitors by an average of 4.1% across all datasets and outperforms GloVe by more than 7%.
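
To make the described setup more concrete, below is a minimal, hypothetical PyTorch-style sketch of a glossary-guided post-processing objective of this general shape. The module structure, the choice of a linear map as the global post-processing function, the linear auto-encoder, and mean-pooling as the sense-composition function are our own illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlossaryGuidedPostprocessor(nn.Module):
    """Hypothetical sketch: glossary-guided post-processing of word vectors.

    Assumptions (ours, not from the paper): the global post-processing
    function is a linear map, sense representations come from a linear
    auto-encoder over glossary-definition vectors in the same space as
    the word embeddings, and senses are composed by averaging.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.post = nn.Linear(dim, dim)     # global post-processing function
        self.encode = nn.Linear(dim, dim)   # definition vector -> sense vector
        self.decode = nn.Linear(dim, dim)   # sense vector -> reconstructed definition

    def forward(self, word_vec, definition_vecs):
        # word_vec: (dim,) pre-trained embedding of one word
        # definition_vecs: (num_senses, dim) vectors of its glossary definitions
        fine_tuned = self.post(word_vec)          # post-processed word vector
        senses = self.encode(definition_vecs)     # one sense vector per definition
        reconstructed = self.decode(senses)       # auto-encoder reconstruction
        composed = senses.mean(dim=0)             # compose senses (assumed: mean)

        recon_loss = F.mse_loss(reconstructed, definition_vecs)
        # constrain the post-processed word vector and the composition of
        # its sense vectors to be similar (cosine distance used here)
        sim_loss = 1.0 - F.cosine_similarity(fine_tuned, composed, dim=0)
        return fine_tuned, recon_loss + sim_loss
```

In such a sketch, the combined loss would be summed over the vocabulary and minimized with any standard optimizer, after which the outputs of the post-processing function replace the original pre-trained vectors.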
