A new word-confidence threshold technique to enhance the performance of spoken dialogue systems

Spoken dialogue systems generally use one or two confidence thresholds during speech recognition. The confidence value assigned to a word represents the recognizer’s confidence that the word was recognized correctly. If the confidence value is below a threshold, the word is considered a recognition error and the system must ask the user to re-enter it. Alternatively, the system can ask the user for confirmation. Environmental conditions and peculiarities of the speaker’s voice can change from one dialogue to another, so the most appropriate value for the confidence threshold must be decided. If the selected value is too low, words wrongly inserted by the recognizer may be treated as correctly recognized. If it is too high, even words actually uttered by the user may be treated as recognition errors or as words that must be confirmed. In this paper we present an experimental strategy to automatically select the most appropriate value for the confidence threshold. This strategy has been applied to the dialogue system we have developed, which handles telephone-based fast food queries and orders. We present the results obtained and indicate possibilities for future work.
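
As a minimal illustration of the thresholding described above, the following Python sketch maps a word's confidence value to one of three actions: reject (ask the user to re-enter the word), confirm, or accept. It is not the system's implementation. The function names, the assumption that confidence values lie in [0, 1], and the default threshold values 0.3 and 0.7 are hypothetical and chosen only for the example. Setting both thresholds to the same value gives the single-threshold case.

```python
from typing import List, Tuple


def classify_word(confidence: float, reject_below: float, accept_from: float) -> str:
    """Map a word confidence value to an action using two thresholds.

    Assumes confidence values lie in [0, 1] (an assumption of this sketch).
    """
    if not 0.0 <= reject_below <= accept_from <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= reject_below <= accept_from <= 1")
    if confidence < reject_below:
        return "reject"   # treated as a recognition error; ask the user to re-enter
    if confidence < accept_from:
        return "confirm"  # ask the user to confirm the word
    return "accept"       # treated as correctly recognized


def classify_utterance(
    words: List[Tuple[str, float]],
    reject_below: float = 0.3,  # hypothetical value, for illustration only
    accept_from: float = 0.7,   # hypothetical value, for illustration only
) -> List[Tuple[str, str]]:
    """Apply classify_word to every (word, confidence) pair of an utterance."""
    return [(w, classify_word(c, reject_below, accept_from)) for w, c in words]


if __name__ == "__main__":
    hypothesis = [("two", 0.92), ("ham", 0.55), ("cola", 0.12)]
    print(classify_utterance(hypothesis))
    # [('two', 'accept'), ('ham', 'confirm'), ('cola', 'reject')]
```

Raising either threshold moves more words toward confirmation or rejection, and lowering it lets more recognizer insertions through, which is the trade-off that motivates selecting the value automatically.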