Paper information - Time-first search for large vocabulary speech recognition

Time-first search for large vocabulary speech recognition

This paper describes a new search technique for large vocabulary speech recognition based on a stack decoder. Combining a tree-based lexicon with the new search technique yields considerable memory savings. The search proceeds time-first: partial path hypotheses are extended into the future in the inner loop, and a tree walk over the lexicon is performed as the outer loop. Partial word hypotheses are grouped by language model state. The stack maintains information about groups of hypotheses, and whole groups are extended by one word to form new stack entries. The paper describes an implementation of a one-pass decoder employing a 65,000-word lexicon and a disk-based trigram language model. Real-time operation is achieved with a small search error, a search space of about 5 Mbyte, and a total memory usage of about 35 Mbyte.
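To make the loop ordering concrete, here is a minimal Python sketch of a time-first word extension. It is an illustration under simplifying assumptions, not the authors' implementation: each phone is modelled as a single state with a self-loop and no transition probabilities, there is no pruning beyond skipping dead subtrees, and the language model and stack are omitted. The names `build_tree`, `time_first_extend`, `entry`, and `acoustic` are hypothetical.

```python
import math

NEG_INF = -math.inf


def build_tree(lexicon):
    """Build a prefix tree over phone sequences; the key '#' marks word ends."""
    root = {}
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.setdefault(phone, {})
        node.setdefault("#", []).append(word)
    return root


def time_first_extend(tree, entry, acoustic):
    """Extend one group of hypotheses by every word in the lexicon tree.

    entry[t]    : log score of the group after consuming t frames (length T + 1)
    acoustic[t] : dict mapping phone -> log likelihood of frame t (length T)
    Returns a dict mapping word -> list of log scores indexed by end time.
    """
    num_frames = len(acoustic)
    results = {}

    def walk(node, incoming):
        # Outer loop: depth-first walk over the lexicon tree.
        for key, child in node.items():
            if key == "#":
                for word in child:
                    results[word] = incoming
                continue
            outgoing = [NEG_INF] * (num_frames + 1)
            # Inner loop: extend this tree node through all future frames.
            for t in range(num_frames):
                best_prev = max(outgoing[t], incoming[t])  # stay or enter
                outgoing[t + 1] = best_prev + acoustic[t][key]
            if max(outgoing) > NEG_INF:  # skip subtrees nothing can reach
                walk(child, outgoing)

    walk(tree, entry)
    return results


# Toy data (invented for illustration only).
lexicon = {"a": ["ah"], "at": ["ae", "t"], "an": ["ae", "n"]}
acoustic = [
    {"ae": -1, "t": -5, "n": -4, "ah": -3},
    {"ae": -1, "t": -2, "n": -4, "ah": -3},
    {"ae": -4, "t": -1, "n": -3, "ah": -3},
]
entry = [0.0, NEG_INF, NEG_INF, NEG_INF]  # the group starts at frame 0
scores = time_first_extend(build_tree(lexicon), entry, acoustic)
```

Because each tree node is processed across all frames before the walk moves on, only the score vectors along the current root-to-node path are live at any moment. This sketch shares the prefix "ae" between "at" and "an", so its score vector is computed once and reused for both words.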

Tony Robinson | James Christie
