Hierarchical topic classification for dialog speech recognition based on language model switching

A speech recognition architecture combining topic detection and topic-dependent language modeling is proposed. In this architecture, a hierarchical back-off mechanism is introduced to improve system robustness. Detailed topic models are applied when topic detection is confident, and wider models that cover multiple topics are applied in cases of uncertainty. In this paper, two topic detection methods are evaluated for the architecture: unigram likelihood and SVM (Support Vector Machine). On the ATR Basic Travel Expression corpus, the two topic detection methods provide comparable reductions in word error rate (WER), of 10.0% and 11.1% respectively, over a single language model system. Finally, the proposed re-decoding approach is compared with an equivalent system based on re-scoring. It is shown that re-decoding is vital to provide optimal recognition performance.
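
The hierarchical back-off described in the abstract can be illustrated with a minimal Python sketch of the unigram-likelihood variant. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy training sentences, the three-level topic tree (`hotel` and `restaurant` under `stay`, `airport` under `transit`, both under `general`), add-one smoothing, and the confidence test, which compares the log-likelihood gap between the two best topics against a hypothetical `margin` threshold. When the gap is too small, the sketch backs off to the narrowest node that covers both candidates.

```python
import math
from collections import Counter

# Toy training data per leaf topic (illustrative only, not from the corpus).
TRAINING = {
    "hotel": ["i want to book a room", "is breakfast included in the room rate"],
    "restaurant": ["a table for two please", "can i see the menu please"],
    "airport": ["where is the boarding gate", "my flight is delayed"],
}

# Assumed topic hierarchy: child -> parent. "general" is the root model.
PARENT_OF = {
    "hotel": "stay",
    "restaurant": "stay",
    "airport": "transit",
    "stay": "general",
    "transit": "general",
}


def train_unigram(sentences, vocab, alpha=1.0):
    """Return log-probabilities of an add-alpha smoothed unigram model."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def log_likelihood(model, words):
    """Sum unigram log-probabilities; out-of-vocabulary words are skipped."""
    return sum(model[w] for w in words if w in model)


def ancestors(node, parent_of):
    """Return the chain from a node up to the root, starting with the node."""
    chain = [node]
    while node in parent_of:
        node = parent_of[node]
        chain.append(node)
    return chain


def select_model(hypothesis, leaf_models, parent_of, margin=1.0):
    """Pick the language model to use for re-decoding a first-pass hypothesis.

    If the best leaf topic beats the runner-up by at least `margin` (in
    log-likelihood), use the detailed leaf model. Otherwise back off to the
    lowest node in the hierarchy that covers both candidates.
    """
    words = hypothesis.split()
    ranked = sorted(
        ((log_likelihood(m, words), t) for t, m in leaf_models.items()),
        reverse=True,
    )
    (best_score, best), (second_score, second) = ranked[0], ranked[1]
    if best_score - second_score >= margin:
        return best
    covered_by_second = set(ancestors(second, parent_of))
    for node in ancestors(best, parent_of):
        if node in covered_by_second:
            return node
    return "general"  # fallback if the two chains never meet


VOCAB = {w for sents in TRAINING.values() for s in sents for w in s.split()}
LEAF_MODELS = {t: train_unigram(s, VOCAB) for t, s in TRAINING.items()}

if __name__ == "__main__":
    for hyp in ["book a room", "i see the room", "is the"]:
        print(hyp, "->", select_model(hyp, LEAF_MODELS, PARENT_OF))
```

In this sketch the returned name would identify which topic-dependent language model the recognizer loads for a second decoding pass. An SVM-based detector could replace `log_likelihood` as the scoring function, provided it yields per-topic scores from which a comparable confidence measure can be derived.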
