A Recurrent Neural Network Language Model Based on Word Embedding

Language modeling is one of the fundamental research problems in natural language processing and a prerequisite for more complex tasks such as speech recognition, machine translation, and question answering. In recent years, neural network language models have become a research hotspot and have substantially improved the effectiveness of language models. In this paper, a recurrent neural network language model (RNNLM) based on word embeddings is proposed, in which the embedding of each word is generated by pre-training the text data with the skip-gram model. An n-gram language model, an RNNLM based on one-hot encoding, and the proposed RNNLM based on word embeddings are evaluated on three different public datasets. The experimental results show that the RNNLM based on word embeddings performs best and significantly reduces the perplexity of the language model.
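To make the described pipeline concrete, the sketch below illustrates the two stages the abstract names: pre-training skip-gram embeddings on raw text, then initializing an RNN language model with those vectors instead of one-hot inputs. This is a minimal illustration assuming gensim and PyTorch; the paper does not specify a framework, and the toy corpus, hyperparameters, and class name `RNNLM` here are illustrative placeholders, not the authors' implementation.

```python
import torch
import torch.nn as nn
from gensim.models import Word2Vec

# Stage 1: pre-train skip-gram word embeddings on the raw text
# (sg=1 selects the skip-gram architecture in gensim).
corpus = [["the", "cat", "sat"], ["the", "dog", "ran"]]  # toy stand-in corpus
w2v = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=20)

# Build the embedding matrix in a fixed vocabulary order.
vocab = w2v.wv.index_to_key
weights = torch.tensor(w2v.wv[vocab])  # shape: (|V|, embedding_dim)

class RNNLM(nn.Module):
    """Recurrent language model whose input layer is initialized
    from the pre-trained skip-gram vectors rather than one-hot codes."""
    def __init__(self, pretrained, hidden_size=128):
        super().__init__()
        vocab_size, embed_dim = pretrained.shape
        self.embed = nn.Embedding.from_pretrained(pretrained, freeze=False)
        self.rnn = nn.RNN(embed_dim, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, token_ids, hidden=None):
        x = self.embed(token_ids)        # (batch, seq_len, embed_dim)
        h, hidden = self.rnn(x, hidden)  # (batch, seq_len, hidden_size)
        return self.out(h), hidden       # logits over the vocabulary

model = RNNLM(weights)
```

Under this setup, the evaluation metric mentioned in the abstract, perplexity, would be computed as the exponential of the mean cross-entropy of the model's next-word predictions on held-out text.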