Paper information - From within-word model search to across-word model search in large vocabulary continuous speech recognition


In this paper we report on the application of across-word context-dependent acoustic phoneme models in a single-pass large vocabulary continuous speech recognizer. Although across-word models are used by many groups today, publications usually give only an outline of the recognizer, and implementation details are often missing. We present both a formal derivation of across-word model search and a detailed description of our implementation. The across-word model system is compared with a conventional within-word model system with respect to word error rate and computational effort. Compared to the baseline within-word system, a straightforward implementation of across-word model search results in a substantial increase in computational effort. Therefore, several optimization steps are studied that result in a more efficient organization of the search space and more efficient pruning. The effects of these optimizations are analysed through detailed profiling. In combination, they accelerate the straightforward implementation of across-word model search by nearly a factor of three. In addition, we discuss the construction of word graphs during across-word model search. Starting from a word graph method based on within-word model search, we derive a formal specification of across-word word graphs. We show that the resulting word graphs are a good representation of the active search space.

Hermann Ney | Achim Sixtus
