Paper information - Fast n-gram language model look-ahead for decoders with static pronunciation prefix trees

Token-passing decoders restrict their search space through various types of token pruning. The Language Model Look-Ahead (LMLA) technique makes it possible to prune more tokens without loss of decoding precision. Unfortunately, for token-passing decoders that use a single static pronunciation prefix tree, full n-gram LMLA considerably increases the number of language model probability calculations required. This paper introduces a method for applying full n-gram LMLA in a decoder with a single static pronunciation prefix tree. The experiments show that this method improves the speed of the decoder without increasing the number of search errors.
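As an illustration of the general LMLA idea mentioned in the abstract (not the fast method the paper introduces), the look-ahead value of a prefix-tree node can be taken as the best language model probability of any word still reachable from that node, for one fixed LM history. The minimal Python sketch below assumes a toy lexicon and made-up probabilities; the names `Node`, `build_tree`, `annotate`, `LEXICON`, and `LOGP` are hypothetical and chosen only for this example. Because the values depend on the history, a full n-gram look-ahead would have to repeat this computation for each distinct history, which is the cost the abstract refers to.

```python
import math


class Node:
    """One node of a toy pronunciation prefix tree (hypothetical structure)."""

    def __init__(self):
        self.children = {}      # phone label -> child Node
        self.words = []         # words whose pronunciation ends at this node
        self.best = -math.inf   # look-ahead value, filled in by annotate()


def build_tree(lexicon):
    """Build a prefix tree from a {word: [phones]} pronunciation lexicon."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def annotate(node, logprob):
    """Set node.best to the highest LM log-probability of any word reachable
    from this node, for the single LM history that `logprob` represents."""
    best = max((logprob[w] for w in node.words), default=-math.inf)
    for child in node.children.values():
        best = max(best, annotate(child, logprob))
    node.best = best
    return best


# Toy data (assumed values, for illustration only).
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "tree": ["t", "r", "iy"],
}
# log P(word | history) for one fixed history.
LOGP = {"speech": math.log(0.5), "speed": math.log(0.1), "tree": math.log(0.4)}

root = build_tree(LEXICON)
annotate(root, LOGP)
```

With these values in place, a decoder could add `node.best` to a token's score while the token is still inside a word, so that tokens heading only toward unlikely words fall below the pruning beam earlier than they would if the LM score were applied at the word end.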

Franciska de Jong | Roeland Ordelman | Marijn Huijbregts
