Extending boosting for call classification using word confusion networks

We are interested in the problem of robust understanding from noisy, spontaneous speech input. In goal-driven human-machine dialog, utterance classification is a key component of the understanding process, used to determine the intent of the speaker. We propose a novel algorithm that exploits word confidence scores from automatic speech recognition (ASR) to better classify spoken utterances. These scores provide estimates of word error rates. While previous work has focused on straightforward incorporation of word confidence scores into Bayesian classifiers, we extend the mathematical formulation of boosting classifiers. This extension allows confidence scores to be exploited from 1-best ASR output or from word confusion networks (WCNs). We present methods for on-line and off-line score combination. We report results on a large database of utterances collected using the AT&T VoiceTone℠ spoken dialog system. Our experiments show a 5% to 10% reduction in error (1 - precision) at a given recall when using WCNs compared to ASR output.
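
As a rough illustration of how word-level confidences from either a 1-best hypothesis or a WCN might be turned into real-valued features for a classifier, consider the sketch below. It is not the formulation developed in the paper. The WCN representation (a list of bins of word/posterior pairs), the `<eps>` empty-word label, the choice of the maximum posterior as the feature value, and the function names are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical WCN representation: a sequence of bins, each holding
# competing word hypotheses paired with posterior probabilities.
Bin = List[Tuple[str, float]]

EPS = "<eps>"  # assumed label for the empty-word hypothesis in a bin


def one_best_features(wcn: List[Bin]) -> Dict[str, float]:
    """Keep only the top hypothesis of each bin, valued by its posterior."""
    feats: Dict[str, float] = {}
    for bin_ in wcn:
        word, post = max(bin_, key=lambda wp: wp[1])
        if word != EPS:
            # If a word is the top hypothesis in several bins, keep its
            # highest posterior (an assumption of this sketch).
            feats[word] = max(feats.get(word, 0.0), post)
    return feats


def wcn_features(wcn: List[Bin]) -> Dict[str, float]:
    """Keep every non-empty hypothesis in every bin, valued by its posterior."""
    feats: Dict[str, float] = {}
    for bin_ in wcn:
        for word, post in bin_:
            if word != EPS:
                feats[word] = max(feats.get(word, 0.0), post)
    return feats


# Toy network with made-up posteriors, for illustration only.
toy_wcn: List[Bin] = [
    [("I", 0.9), (EPS, 0.1)],
    [("want", 0.6), ("won't", 0.4)],
    [("billing", 0.55), ("building", 0.45)],
]

print(one_best_features(toy_wcn))
print(wcn_features(toy_wcn))
```

In this toy network the 1-best features drop the competing hypotheses "won't" and "building", while the WCN features keep them with their lower posteriors, so a downstream classifier can still use words that the recognizer ranked second.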
