Incremental Skip-gram Model with Negative Sampling

This paper explores an incremental training strategy for the skip-gram model with negative sampling (SGNS) from both empirical and theoretical perspectives. Existing neural word embedding methods, including SGNS, are multi-pass algorithms and thus cannot perform incremental model updates. To address this problem, we present a simple incremental extension of SGNS and provide a thorough theoretical analysis to demonstrate its validity. Experiments demonstrate both the correctness of the theoretical analysis and the practical usefulness of the incremental algorithm.
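To make the setting concrete, below is a minimal sketch of what a per-token incremental SGNS update could look like. It is a hypothetical illustration, not the paper's exact algorithm: the class name IncrementalSGNS, the method learn_token, and all hyperparameter defaults are assumptions, and the published method may additionally rely on techniques such as adaptive learning rates or bounded vocabularies. The point the sketch captures is that the word counts, the negative-sampling noise distribution, and the embeddings are all updated as each token arrives, so no second pass over the corpus is required.

    import numpy as np
    from collections import defaultdict

    class IncrementalSGNS:
        """Hypothetical per-token SGNS trainer (illustrative sketch only)."""

        def __init__(self, dim=50, window=2, neg=5, lr=0.025, power=0.75, seed=0):
            self.dim, self.window, self.neg = dim, window, neg
            self.lr, self.power = lr, power
            self.rng = np.random.default_rng(seed)
            self.count = defaultdict(int)   # unigram counts, updated per token
            self.w_in = {}                  # target-word vectors
            self.w_out = {}                 # context-word vectors
            self.buffer = []                # trailing context window

        def _vec(self, table, word):
            # Lazily initialize a vector the first time a word is seen.
            if word not in table:
                table[word] = (self.rng.random(self.dim) - 0.5) / self.dim
            return table[word]

        def _negative_samples(self, k):
            # Draw from the *current* smoothed unigram distribution; it drifts
            # as new tokens arrive, which is the crux of the incremental setting.
            words = list(self.count)
            p = np.array([self.count[w] for w in words], dtype=float) ** self.power
            return self.rng.choice(words, size=k, p=p / p.sum())

        def _sgd_pair(self, target, context, label):
            # One SGD step on the negative-sampling logistic loss.
            v = self._vec(self.w_in, target)
            u = self._vec(self.w_out, context)
            g = self.lr * (1.0 / (1.0 + np.exp(-v @ u)) - label)
            v_old = v.copy()
            v -= g * u          # in-place: updates the stored vector
            u -= g * v_old

        def learn_token(self, token):
            self.count[token] += 1
            for ctx in self.buffer:
                for t, c in ((token, ctx), (ctx, token)):
                    self._sgd_pair(t, c, 1.0)                 # positive pair
                    for noise in self._negative_samples(self.neg):
                        self._sgd_pair(t, noise, 0.0)         # negative pairs
            self.buffer = (self.buffer + [token])[-self.window:]

    # Toy usage: feed a token stream one word at a time.
    model = IncrementalSGNS()
    for tok in "the cat sat on the mat".split():
        model.learn_token(tok)

Recomputing the noise distribution from scratch on every draw, as done here for clarity, would be expensive at scale; a practical implementation would maintain the distribution incrementally (for example, with a streaming sampling structure) rather than rebuilding it per token.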
