Paper information - Application of confidence measures for dialogue systems through the use of parallel speech recognizers

Assessing the correctness of a recognizer's output at any point in a dialogue is a complex task that has been studied thoroughly over the past decade. Its importance lies in the need for robust dialogue systems capable of dealing with the difficulties inherent in human-machine communication: user errors and corrections, speech recognizer errors, error recovery techniques, etc. In this paper, we present a novel approach to the problem of deciding what the user has said. We use confidence measures derived from low-level knowledge sources (acoustic and linguistic information) and generated in parallel by several topic-adapted speech recognizers. Each recognizer targets a particular topic, and the confidence measures are compared by a classifier that selects the most probable solution. This approach proves especially well suited to difficult topics, such as proper names or confirmations, which are highly meaningful for error correction tasks. These topics present high error rates with an application-wide speech recognizer, but recognition correctness is greatly enhanced through the use of parallel recognizers. Moreover, topic-adapted recognizers also seem to help in identifying the user's intention and in detecting out-of-application utterances.
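
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier, the form of the confidence measures, or any thresholds, so everything here is an assumption made for illustration: the `Hypothesis` structure, the weighted-average rule in `combined_score` (a stand-in for the paper's classifier), and the `reject_below` threshold used to flag a possible out-of-application utterance are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    """Output of one topic-adapted recognizer (hypothetical structure)."""
    topic: str
    text: str
    acoustic_conf: float    # assumed to lie in [0, 1]
    linguistic_conf: float  # assumed to lie in [0, 1]


def combined_score(h: Hypothesis, w_acoustic: float = 0.5) -> float:
    """Weighted average of the two confidence measures.

    This is an assumed stand-in for the classifier mentioned in the
    abstract, not the method used in the paper.
    """
    return w_acoustic * h.acoustic_conf + (1.0 - w_acoustic) * h.linguistic_conf


def select_hypothesis(
    hyps: List[Hypothesis], reject_below: float = 0.4
) -> Optional[Hypothesis]:
    """Pick the highest-scoring hypothesis among the parallel recognizers.

    Returns None when there are no hypotheses or when even the best score
    falls below `reject_below` (an assumed threshold), which this sketch
    treats as a possible out-of-application utterance.
    """
    if not hyps:
        return None
    best = max(hyps, key=combined_score)
    if combined_score(best) < reject_below:
        return None
    return best


# Made-up outputs from three parallel recognizers for one utterance.
example = [
    Hypothesis("proper_names", "carmen garcia", 0.82, 0.74),
    Hypothesis("confirmations", "yes", 0.35, 0.40),
    Hypothesis("general", "car men gar see a", 0.50, 0.30),
]
best = select_hypothesis(example)
```

In this sketch the topic label of the winning recognizer is kept alongside the text, which reflects the abstract's remark that topic-adapted recognizers may also help identify the user's intention.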

Carmen García-Mateo | David Pérez-Piñar López
