Paper information - An Efficient A* Stack Decoder Algorithm for Continuous Speech Recognition with a Stochastic Language Model

The stack decoder is an attractive algorithm for controlling the acoustic and language model matching in a continuous speech recognizer. A previous paper described a near-optimal admissible Viterbi A* search algorithm for use with non-cross-word acoustic models and no-grammar language models [16]. This paper extends that algorithm to include unigram language models and describes a modified version of the algorithm which includes the full (forward) decoder, cross-word acoustic models, and longer-span language models. The resultant algorithm is not admissible, but has been demonstrated to have a low probability of search error and to be very efficient.
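
As a rough illustration of the general stack-decoder idea mentioned in the abstract (not the paper's actual algorithm), the sketch below runs a best-first A* search over partial word sequences kept on a priority queue. Everything in it is a simplifying assumption made for illustration: the function name `stack_decode`, the fixed number of word slots, the per-slot acoustic log-likelihood tables, and the unigram log-probabilities are all hypothetical. The heuristic here is an exact upper bound on the remaining score, so this toy search is admissible, unlike the modified algorithm the abstract describes.

```python
import heapq
import math


def stack_decode(acoustic, unigram):
    """Toy best-first (A*) stack decoder over a fixed number of word slots.

    acoustic: list of dicts; acoustic[t][word] is a (hypothetical) acoustic
              log-likelihood of `word` in slot t.
    unigram:  dict; unigram[word] is a unigram language-model log-probability.
    Returns (best word list, its log score, number of stack pops).
    """
    n = len(acoustic)
    # Best achievable combined score in each slot.
    best = [max(a[w] + unigram[w] for w in a) for a in acoustic]
    # h[t] = upper bound on the score of slots t..n-1 (admissible heuristic).
    h = [0.0] * (n + 1)
    for t in range(n - 1, -1, -1):
        h[t] = h[t + 1] + best[t]

    # heapq is a min-heap, so store the negated estimate f = g + h.
    stack = [(-h[0], 0.0, ())]
    pops = 0
    while stack:
        _neg_f, g, words = heapq.heappop(stack)
        pops += 1
        t = len(words)
        if t == n:
            # First complete hypothesis popped is optimal under this heuristic.
            return list(words), g, pops
        # Extend the popped partial hypothesis by every candidate word.
        for w, a in acoustic[t].items():
            g2 = g + a + unigram[w]
            heapq.heappush(stack, (-(g2 + h[t + 1]), g2, words + (w,)))
    return None, -math.inf, pops


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    acoustic = [{"the": -1.0, "a": -2.0}, {"cat": -3.0, "cap": -2.5}]
    unigram = {"the": -1.0, "a": -1.5, "cat": -2.0, "cap": -4.0}
    print(stack_decode(acoustic, unigram))
```

A real recognizer would score words against variable-length stretches of audio and would not know the word boundaries in advance; the fixed slots here only keep the example short.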

Douglas B. Paul

[1] Lalit R. Bahl, et al. A Maximum Likelihood Approach to Continuous Speech Recognition, 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Nils J. Nilsson, et al. Problem-solving methods in artificial intelligence, 1971, McGraw-Hill computer science series.

[3] James Glass, et al. Integration of speech recognition and natural language processing in the MIT VOYAGER system, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[4] Steve Austin, et al. The forward-backward search algorithm, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[5] W. W. Bledsoe, et al. Review of "Problem-Solving Methods in Artificial Intelligence by Nils J. Nilsson", McGraw-Hill Pub., 1971, SGAR.

[6] Volker Steinbiss, et al. Sentence-hypotheses generation in a continuous-speech recognition system, 1989, EUROSPEECH.

[7] Douglas B. Paul. The Lincoln tied-mixture HMM continuous speech recognizer, 1990.

[8] Robert Roth, et al. A Rapid Match Algorithm for Continuous Speech Recognition, 1990, HLT.

[9] D. O'Shaughnessy, et al. A*-admissible heuristics for rapid lexical access, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[10] Donald E. Knuth, et al. The art of computer programming: sorting and searching (volume 3), 1973.

[11] Douglas B. Paul. New Results with the Lincoln Tied-Mixture HMM CSR System, 1991, HLT.

[12] F. Jelinek. Fast sequential decoding algorithm using a stack, 1969.

[13] Bruce T. Lowerre, et al. The HARPY speech recognition system, 1976.

[14] R. Schwartz, et al. A comparison of several approximate algorithms for finding multiple (N-best) sentence hypotheses, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[15] Lalit R. Bahl, et al. A fast approximate acoustic match for large vocabulary speech recognition, 1989, IEEE Trans. Speech Audio Process.

[16] Douglas B. Paul, et al. Speech Recognition Using Hidden Markov Models, 1990.

[17] Frank K. Soong, et al. A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition, 1990, HLT.

[18] Dimitri Kanevsky, et al. Matrix fast match: a fast method for identifying a short list of candidate words for decoding, 1989, International Conference on Acoustics, Speech, and Signal Processing.

[19] Patti Price, et al. The DARPA 1000-word resource management database for continuous speech recognition, 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.

[20] A. Poritz, et al. Hidden Markov models: a guided tour, 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.

[21] Douglas B. Paul, et al. Algorithms for an Optimal A* Search and Linearizing the Search in the Stack Decoder, 1991, HLT.

[22] Douglas B. Paul. A CSR-NL Interface Specification Version 1.51, 1989, HLT.