Efficient search with posterior probability estimates in HMM-based speech recognition

In this paper we present methods for estimating posterior probabilities of HMM states in continuous and discrete HMM-based speech recognition systems, together with several ways to speed up decoding using these estimates. The proposed pruning techniques are state deactivation pruning (SDP), which is similar to an approach proposed for hybrid recognition systems, and a novel posterior-based lookahead technique, posteriori lookahead pruning (PLP), which evaluates future posteriors in order to exclude unlikely HMM states as early as possible during search. Applying these methods substantially reduced the decoding time of our time-synchronous Viterbi decoder for recognition systems based on the Verbmobil and Wall Street Journal databases, with hardly any additional search errors.
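
As a rough illustration of the idea behind state deactivation pruning, the sketch below turns per-state log-likelihoods for one frame into posterior estimates via Bayes' rule and keeps only the states whose posterior reaches a threshold. The function names, the uniform priors, the toy scores, and the threshold value are all assumptions made for this example and are not taken from the paper; the lookahead variant (PLP) is not sketched because the abstract does not specify its pruning criterion.

```python
import math

def state_posteriors(log_likelihoods, log_priors):
    """Estimate P(state | frame) from log p(frame | state) and log P(state).

    Uses Bayes' rule in the log domain with a log-sum-exp normalisation
    so that very small likelihoods do not underflow.
    """
    joint = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    peak = max(joint)
    log_norm = peak + math.log(sum(math.exp(j - peak) for j in joint))
    return [math.exp(j - log_norm) for j in joint]

def active_states(posteriors, threshold):
    """Return indices of states that survive deactivation pruning.

    A state is deactivated for this frame when its posterior estimate
    falls below the (hypothetical) threshold.
    """
    return [s for s, p in enumerate(posteriors) if p >= threshold]

# Toy frame with four HMM states (assumed values, for illustration only).
log_likelihoods = [-2.0, -9.0, -2.5, -12.0]
log_priors = [math.log(0.25)] * 4  # uniform priors, an assumption

post = state_posteriors(log_likelihoods, log_priors)
kept = active_states(post, threshold=1e-3)
print(kept)
```

In a time-synchronous Viterbi search, only the states in `kept` would need to be expanded for this frame, which is where the saving in decoding time would come from; how the threshold trades speed against search errors has to be determined empirically.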
