Paper information - Design of the CMU Sphinx-4 Decoder

Design of the CMU Sphinx-4 Decoder

The decoder of the Sphinx-4 speech recognition system incorporates several design strategies that have not previously been used in conventional decoders for HMM-based large-vocabulary speech recognition systems. These include graph construction for multilevel parallel decoding with independent simultaneous feature streams, without the use of compound HMMs; a generalized search algorithm that subsumes Viterbi and full-forward decoding as special cases; and the design of generalized language HMM graphs, built from grammars and language models in multiple standard formats, that can be switched trivially from a flat search structure to a tree search structure. This paper describes some salient design aspects of the Sphinx-4 decoder and includes preliminary performance measures of speed and accuracy.
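To illustrate how one search routine can cover both Viterbi and full-forward decoding, the following is a minimal sketch. It is not taken from the paper or from the Sphinx-4 code base. It assumes a small dense HMM in the log domain, and the names `combine`, `decode_score`, and the parameter `eta` are hypothetical. Scores of paths entering a state are merged with a scaled log-sum-exp: `eta = 1` gives the forward (sum over paths) score, and `eta = math.inf` gives the Viterbi (best path) score.

```python
import math


def combine(log_scores, eta):
    """Merge the log-scores of all paths entering one state.

    Illustrative assumption, not the paper's algorithm:
    eta = 1 is log-sum-exp (forward), eta = math.inf is max (Viterbi).
    """
    m = max(log_scores)
    if math.isinf(eta) or m == -math.inf:
        return m
    # (1/eta) * log(sum(exp(eta * s))), computed stably around the max.
    return m + math.log(sum(math.exp(eta * (s - m)) for s in log_scores)) / eta


def decode_score(log_init, log_trans, log_emit, obs, eta):
    """Score an observation sequence under an HMM with a tunable merge rule."""
    n = len(log_init)
    # Initialise with the first observation.
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            combine([alpha[p] + log_trans[p][s] for p in range(n)], eta)
            + log_emit[s][o]
            for s in range(n)
        ]
    # Merge over final states with the same rule.
    return combine(alpha, eta)
```

Calling `decode_score(..., eta=1.0)` returns the log of the total sequence probability, while `eta=math.inf` returns the log-probability of the single best state path. Intermediate values interpolate between the two in this sketch.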
