Efficient backward decoding of high-order hidden Markov models

The forward-backward search (FBS) algorithm [S. Austin, R. Schwartz, P. Placeway, The forward-backward search algorithm, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 1991, pp. 697-700] has resulted in increases in speed of up to a factor of 40 in expensive time-synchronous beam searches in hidden Markov model (HMM) based speech recognition [R. Schwartz, S. Austin, Efficient, high-performance algorithms for N-best search, in: Proceedings of the Workshop on Speech and Natural Language, 1990, pp. 6-11; L. Nguyen, R. Schwartz, F. Kubala, P. Placeway, Search algorithms for software-only real-time recognition with very large vocabularies, in: Proceedings of the Workshop on Human Language Technology, 1993, pp. 91-95; A. Sixtus, S. Ortmanns, High-quality word graphs using forward-backward pruning, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 1999, pp. 593-596]. This speed-up is typically achieved by using a simplified forward search to reduce the computation required by the subsequent detailed backward search. FBS implicitly assumes that forward and backward searches of HMMs are computationally equivalent. In this paper we present experimental results, obtained on the CallFriend database, which show that this assumption is incorrect for conventional high-order HMMs. Consequently, any improvement in computational efficiency gained by using conventional low-order HMMs in the simplified forward search of FBS is lost during the detailed backward search. We solve this problem by introducing a new formulation of HMMs, termed the right-context HMM, which is equivalent to conventional HMMs. We show that the computational expense of backward Viterbi-beam decoding of right-context HMMs is similar to that of forward decoding conventional HMMs. Though not the subject of this paper, this result allows high-order HMMs to be decoded more efficiently by capitalising on the improvements in computational efficiency that the FBS algorithm provides.
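
To illustrate the search strategy described above, the sketch below computes forward and backward Viterbi scores for a first-order HMM in the log domain and combines them to decide which states a detailed pass would still need to expand. This is only a minimal sketch of the general FBS idea under assumed toy parameters; it is not the paper's right-context formulation or its Viterbi-beam implementation, and the function names (viterbi_forward, viterbi_backward, fbs_active_states) and the beam value are illustrative assumptions.

```python
import numpy as np

def viterbi_forward(log_pi, log_A, log_B, obs):
    """Best log-score over paths ending in each state at each time step.
    This pass plays the role of the simplified search in FBS."""
    T, N = len(obs), len(log_pi)
    alpha = np.full((T, N), -np.inf)
    alpha[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t, j] = np.max(alpha[t - 1] + log_A[:, j]) + log_B[j, obs[t]]
    return alpha

def viterbi_backward(log_A, log_B, obs):
    """Best log-score over paths continuing from each state to the end
    (emission at the current time step excluded, so alpha + beta is the
    score of the best complete path through that state)."""
    T, N = len(obs), log_B.shape[0]
    beta = np.zeros((T, N))
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t, i] = np.max(log_A[i] + log_B[:, obs[t + 1]] + beta[t + 1])
    return beta

def fbs_active_states(alpha, beta, beam):
    """FBS-style pruning: keep only states whose combined forward+backward
    score lies within `beam` of the best complete-path score at that time."""
    combined = alpha + beta
    best = combined.max(axis=1, keepdims=True)
    return combined >= best - beam

if __name__ == "__main__":
    # Toy 2-state, 3-symbol model; all parameter values are assumed.
    log_pi = np.log(np.array([0.6, 0.4]))
    log_A = np.log(np.array([[0.7, 0.3],
                             [0.4, 0.6]]))
    log_B = np.log(np.array([[0.5, 0.4, 0.1],
                             [0.1, 0.3, 0.6]]))
    obs = [0, 1, 2, 2]

    alpha = viterbi_forward(log_pi, log_A, log_B, obs)
    beta = viterbi_backward(log_A, log_B, obs)
    print(fbs_active_states(alpha, beta, beam=2.0))
```

For a first-order HMM the two recursions above have the same cost, which is exactly the computational equivalence that FBS assumes and that the paper shows does not hold for conventional high-order HMMs.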

[1] Hermann Ney, et al., Improvements in beam search, 1994, ICSLP.

[2] Johan A. du Preez, Efficient training of high-order hidden Markov models using first-order representations, 1998, Comput. Speech Lang.

[3] Stefan Ortmanns, et al., High quality word graphs using forward-backward pruning, 1999, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

[4] L. Rabiner, et al., The acoustics, speech, and signal processing society - A historical perspective, 1984, IEEE ASSP Magazine.

[5] Richard Washington, et al., Learning to Automatically Detect Features for Mobile Robots Using Second-Order Hidden Markov Models, 2003, IJCAI.

[6] Bruce T. Lowerre, et al., The HARPY speech recognition system, 1976.

[7] Abdelaziz Kriouile, et al., Automatic word recognition based on second-order hidden Markov models, 1994, IEEE Trans. Speech Audio Process.

[8] Jean-François Mari, et al., A second-order HMM for high performance word and phoneme-based continuous speech recognition, 1996, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

[9] G. Forney, Jr., The Viterbi algorithm, 1973.

[10] Ben M. Herbst, et al., Estimating the pen trajectories of static signatures using hidden Markov models, 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11] Lawrence R. Rabiner, et al., A tutorial on hidden Markov models and selected applications in speech recognition, 1989, Proc. IEEE.

[12] L. Baum, et al., A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, 1970.

[13] A. Berchtold, The double chain Markov model, 1999.

[14] F. Jelinek, et al., Continuous speech recognition by statistical methods, 1976, Proceedings of the IEEE.

[15] Herman A. Engelbrecht, et al., Efficient Decoding of High-order Hidden Markov Models, 2007.

[16] Mosur Ravishankar, et al., Efficient Algorithms for Speech Recognition, 1996.

[17] L. Baum, et al., An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology, 1967.

[18] A. Berchtold, High-order extensions of the Double Chain Markov Model, 2002.

[19] Linguistic Data Consortium, 1999.

[20] Richard M. Schwartz, et al., Efficient, High-Performance Algorithms for N-Best Search, 1990, HLT.

[21] L. Rabiner, et al., An introduction to hidden Markov models, 1986, IEEE ASSP Magazine.

[22] Richard M. Schwartz, et al., Search Algorithms for Software-Only Real-Time Recognition with Very Large Vocabularies, 1993, HLT.

[23] L. Baum, et al., Statistical Inference for Probabilistic Functions of Finite State Markov Chains, 1966.

[24] Hermann Ney, et al., Data driven search organization for continuous speech recognition, 1992, IEEE Trans. Signal Process.

[25] R. Bellman, Dynamic programming, 1957, Science.

[26] Lalit R. Bahl, et al., A Maximum Likelihood Approach to Continuous Speech Recognition, 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[27] A. Poritz, et al., Hidden Markov models: a guided tour, 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.

[28] Yoshua Bengio, et al., Markovian Models for Sequential Data, 2004.

[29] S. E. Levinson, et al., Structural methods in automatic speech recognition, 1985, Proceedings of the IEEE.

[30] Lalit R. Bahl, et al., Speech recognition with continuous-parameter hidden Markov models, 1987.

[31] Andrew J. Viterbi, et al., Error bounds for convolutional codes and an asymptotically optimum decoding algorithm, 1967, IEEE Trans. Inf. Theory.

[32] Steve Austin, et al., The forward-backward search algorithm, 1991, ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.