One pass cross word decoding for large vocabularies based on a lexical tree search organization

This paper describes the new Philips Research decoder that performs large vocabulary continuous speech recognition in a single pass for cross-word acoustic models and an m-gram language model (with m up to 4) as opposed to our previous technique of multiple passes. The decoder is based on a time-synchronous beam search and a prefix tree structure of the lexicon. Cross-word transitions are treated dynamically. A language-model look-ahead technique is applied on the bigram probabilities. On a variety of speech data, reduced error rates are obtained together with significant speed-ups, confirming the advantage of an early use of all available knowledge sources. In particular, the search effort of a one-pass trigram decoding is only marginally increased compared to bigram, and the integration of cross-word triphones improves the overall accuracy by typically 10% relative.
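
To make two of the ideas named in the abstract concrete, the following is a minimal sketch of a pronunciation prefix tree with a language-model look-ahead score stored at each node. It is an illustration under stated assumptions, not the Philips decoder: the toy lexicon, the phone symbols, the probabilities in `BIGRAM_AFTER_THE`, and all function names are hypothetical. The look-ahead value at a node is taken here as the best bigram probability of any word still reachable below that node, which is one common way such a look-ahead is defined.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    words: List[str] = field(default_factory=list)  # words whose pronunciation ends here
    lookahead: float = 0.0                          # best LM probability reachable from here


def build_prefix_tree(lexicon: Dict[str, List[str]]) -> Node:
    """Merge pronunciations so that words sharing initial phones share arcs."""
    root = Node()
    for word, phones in lexicon.items():
        node = root
        for phone in phones:
            node = node.children.setdefault(phone, Node())
        node.words.append(word)
    return root


def annotate_lookahead(node: Node, bigram_given_prev: Dict[str, float]) -> float:
    """Store at each node the maximum bigram probability over reachable words."""
    best = max((bigram_given_prev.get(w, 0.0) for w in node.words), default=0.0)
    for child in node.children.values():
        best = max(best, annotate_lookahead(child, bigram_given_prev))
    node.lookahead = best
    return best


def follow(root: Node, phones: List[str]) -> Node:
    """Return the node reached by following a phone sequence from the root."""
    node = root
    for phone in phones:
        node = node.children[phone]
    return node


def count_arcs(node: Node) -> int:
    """Count the phone arcs in the tree."""
    return sum(1 + count_arcs(child) for child in node.children.values())


# Hypothetical toy lexicon: word -> phone sequence.
LEXICON = {
    "speech": ["s", "p", "iy", "ch"],
    "speed": ["s", "p", "iy", "d"],
    "spin": ["s", "p", "ih", "n"],
    "tree": ["t", "r", "iy"],
}

# Hypothetical bigram probabilities p(word | "the").
BIGRAM_AFTER_THE = {"speech": 0.5, "speed": 0.2, "spin": 0.1, "tree": 0.2}

root = build_prefix_tree(LEXICON)
annotate_lookahead(root, BIGRAM_AFTER_THE)
```

In this toy case the four pronunciations contain 15 phones in total, but the tree needs only 10 arcs because "speech", "speed", and "spin" share their initial phones. In a beam search, the look-ahead score of a node could be combined with the acoustic score of a partial hypothesis, so that branches leading only to unlikely words can be pruned before a word end is reached.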

Xavier L. Aubert
