A study of neural network Russian language models for automatic continuous speech recognition systems

We present the results of a study of Russian language models built with recurrent artificial neural networks for automatic continuous speech recognition systems. We build neural network models with different numbers of units in the hidden layer and linearly interpolate them with a baseline trigram language model. The resulting models are used to rescore the N-best list. In our experiments on continuous Russian speech recognition with an extra-large vocabulary (150 thousand word forms), rescoring the 50-best list with the neural network language models interpolated with the trigram model yielded a relative word error rate reduction of 14%.
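
The interpolation and N-best rescoring described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the interpolation weight `lam`, the language model scale `lm_weight`, and the representation of each hypothesis as a word list with an acoustic log-score are all assumptions made for illustration. The two language models are passed in as hypothetical callables that return a word probability given its history.

```python
import math

def interpolate(p_rnn, p_ngram, lam=0.5):
    """Linear interpolation of two word probabilities.

    lam is the weight of the neural network model (assumed value);
    the trigram model receives the remaining weight 1 - lam.
    """
    return lam * p_rnn + (1.0 - lam) * p_ngram

def sentence_logprob(words, rnn_prob, ngram_prob, lam=0.5):
    """Sum of log interpolated probabilities over the words of a hypothesis.

    rnn_prob and ngram_prob are hypothetical callables (word, history) -> probability.
    """
    total = 0.0
    for i, word in enumerate(words):
        history = tuple(words[:i])
        p = interpolate(rnn_prob(word, history), ngram_prob(word, history), lam)
        total += math.log(p)
    return total

def rescore_nbest(nbest, rnn_prob, ngram_prob, lam=0.5, lm_weight=1.0):
    """Re-rank an N-best list of (words, acoustic_logprob) pairs.

    Each hypothesis is scored as acoustic log-score plus a scaled
    interpolated language model log-score; the result is sorted best-first.
    """
    scored = []
    for words, acoustic_logprob in nbest:
        lm_logprob = sentence_logprob(words, rnn_prob, ngram_prob, lam)
        scored.append((acoustic_logprob + lm_weight * lm_logprob, words))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [words for _, words in scored]
```

In practice the interpolation weight and the language model scale would typically be tuned on held-out data, and the first hypothesis of the re-ranked list is taken as the recognition result.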
