On Continuous Space Word Representations as Input of LSTM Language Model

Artificial neural networks have become the state of the art in the task of language modelling, with Long Short-Term Memory (LSTM) networks appearing to be an especially efficient architecture. The continuous skip-gram and the continuous bag-of-words (CBOW) are algorithms for learning high-quality distributed vector representations that are able to capture a large number of syntactic and semantic word relationships. In this paper, we carried out experiments with a combination of these powerful models: continuous word representations trained with the skip-gram, CBOW, or GloVe method, together with a word cache expressed as a vector using latent Dirichlet allocation (LDA). These are all used on the input of an LSTM network instead of the 1-of-N coding traditionally used in language models. The proposed models are tested on the Penn Treebank and MALACH corpora.
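To make the described input scheme concrete, the following is a minimal sketch (not the authors' implementation) of an LSTM language model whose input is a pretrained continuous word vector (e.g. from skip-gram, CBOW, or GloVe) concatenated with an LDA topic vector summarizing the recent word cache, in place of 1-of-N coding. The framework (PyTorch), class name, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EmbeddingLDAInputLSTMLM(nn.Module):
    """Sketch: LSTM LM fed with pretrained word vectors + an LDA cache vector."""

    def __init__(self, pretrained_embeddings, lda_dim, hidden_dim, vocab_size):
        super().__init__()
        emb_dim = pretrained_embeddings.size(1)
        # Fixed, pretrained word vectors (skip-gram / CBOW / GloVe); kept frozen here.
        self.embed = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=True)
        # LSTM input is the concatenation of the word vector and the LDA topic vector.
        self.lstm = nn.LSTM(emb_dim + lda_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_ids, lda_cache, state=None):
        # word_ids:   (batch, seq_len) token indices
        # lda_cache:  (batch, seq_len, lda_dim) topic distribution of the word cache
        x = torch.cat([self.embed(word_ids), lda_cache], dim=-1)
        h, state = self.lstm(x, state)
        return self.out(h), state

# Example with hypothetical sizes (10k vocabulary, 300-d embeddings, 50 LDA topics).
vocab_size, emb_dim, lda_dim, hidden_dim = 10000, 300, 50, 650
emb = torch.randn(vocab_size, emb_dim)  # placeholder for real pretrained vectors
model = EmbeddingLDAInputLSTMLM(emb, lda_dim, hidden_dim, vocab_size)
logits, _ = model(torch.randint(0, vocab_size, (8, 20)), torch.rand(8, 20, lda_dim))
```

The key design point reflected here is that the softmax output layer still predicts over the full vocabulary, while the input side replaces the sparse 1-of-N vector with a compact continuous representation augmented by cache/topic information.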