Paper information - Dynamic programming search techniques for across-word modelling in speech recognition

Dynamic programming search techniques for across-word modelling in speech recognition

We describe the integration of across-word models in the RWTH large vocabulary continuous speech recognition system, where our main focus is on the realization of the acoustic recognition process. This paper presents a study of two search methods based on the principle of dynamic programming. For both methods we discuss the implementation details and give experimental results on the Verbmobil and on the Wall Street Journal data. In addition, we introduce a score interpolation of within-word and across-word models for both search methods. In combination with across-word models this interpolation technique gives an improvement of the recognition accuracy by 14% relative to our standard system.

Stefan Ortmanns | Klaus Beulen | Christian Elting
