An Improved Mandarin Voice Input System Using Recurrent Neural Network Language Model

In this paper, we present our recent work on using a Recurrent Neural Network Language Model (RNNLM) in a Mandarin voice input system. Specifically, the RNNLM is used in conjunction with a large high-order n-gram language model (LM) to rescore the N-best list. However, repeated computations across hypotheses that share a common prefix make this rescoring procedure inefficient. We therefore propose a new N-best list rescoring framework, called Prefix Tree based N-best list Rescore (PTNR), which eliminates these repeated computations and speeds up the rescoring procedure. Experiments show that the RNNLM yields about a 4.5% relative reduction in word error rate (WER), and that, compared to conventional N-best list rescoring, PTNR achieves a speed-up factor of 3-4. Compared to the cache-based method, the design of PTNR is simpler and more explicit, and it requires a smaller memory footprint.
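The core idea behind prefix-tree rescoring can be illustrated with a minimal sketch. The snippet below is not the paper's implementation: it organizes the N-best hypotheses into a trie so that the LM score of each shared word prefix is computed exactly once, whereas naive rescoring recomputes it for every hypothesis. The `toy_lm_score` function is a hypothetical stand-in for one RNNLM step (a real system would also propagate the hidden state along each trie path).

```python
from math import log

class TrieNode:
    """One node per word prefix of the N-best list."""
    def __init__(self):
        self.children = {}
        self.logprob = 0.0  # cumulative LM score of the prefix ending here

def toy_lm_score(history, word):
    # Hypothetical stand-in for an RNNLM step p(word | history).
    # A real RNNLM would carry a hidden state forward along the prefix;
    # a deterministic toy score keeps this sketch self-contained.
    return -log(((len(history) + len(word)) % 5) + 2)

def rescore_nbest(nbest):
    """Score each hypothesis, computing each shared prefix only once."""
    root = TrieNode()
    lm_calls = 0
    scores = []
    for hyp in nbest:
        node, history = root, []
        for word in hyp:
            if word not in node.children:
                # New prefix: one LM call, cached in the trie thereafter.
                child = TrieNode()
                child.logprob = node.logprob + toy_lm_score(history, word)
                lm_calls += 1
                node.children[word] = child
            node = node.children[word]
            history.append(word)
        scores.append(node.logprob)
    return scores, lm_calls
```

For three hypotheses sharing prefixes, e.g. `["i","like","tea"]`, `["i","like","coffee"]`, and `["i","love","tea"]`, the trie needs only 6 LM calls instead of the 9 a hypothesis-by-hypothesis pass would make, while producing identical scores; the speed-up grows with the amount of prefix overlap in the N-best list.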