A SPOKEN DIALOG SYSTEM WITH AUTOMATIC RECOVERY MECHANISM FROM MISRECOGNITION

We propose a novel dialog strategy that recovers from misrecognition over the course of a spoken dialog. To recover from misrecognition without explicit confirmation, our system keeps multiple understanding hypotheses at each turn and searches for a globally optimal hypothesis across the user's utterances in the whole dialog. For the dialog strategy, we introduce a new criterion, based on 'efficiency for convergence' and 'consistency with understanding hypotheses', for selecting an appropriate system response. Using this criterion, the system resolves ambiguity while avoiding responses that conflict with the user's actual intent and would therefore feel unnatural. We also propose using repetition utterance detection to update the understanding hypotheses. We developed a spoken dialog system using these techniques and present dialog examples in which misrecognition was corrected naturally. We also show that our strategy is efficient in terms of the number of turns.
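
The idea of keeping several understanding hypotheses per turn and searching across turns can be illustrated with a minimal sketch. This is not the system's actual implementation: the scoring rule (summed log-scores), the floor constant `MISSING_LOG_SCORE`, the function name `global_best`, and the example city names and scores are all assumptions made for illustration.

```python
import math

# Hypothetical floor log-score for a value absent from a turn's n-best list.
MISSING_LOG_SCORE = math.log(0.05)


def global_best(turns):
    """Return (best_value, totals) over all turns.

    `turns` is a list of n-best lists; each n-best list is a dict mapping
    a candidate value to a recognition score in (0, 1]. A value's total is
    the sum of its log-scores, with a floor for turns where it is missing.
    """
    candidates = {value for nbest in turns for value in nbest}
    totals = {
        value: sum(
            math.log(nbest[value]) if value in nbest else MISSING_LOG_SCORE
            for nbest in turns
        )
        for value in candidates
    }
    # Sort first so that ties are broken deterministically.
    best = max(sorted(totals), key=totals.get)
    return best, totals


turns = [
    {"Boston": 0.5, "Austin": 0.4},   # turn 1: top hypothesis is wrong
    {"Austin": 0.6, "Houston": 0.3},  # turn 2: user repeats the city
]
best, totals = global_best(turns)
print(best)  # prints Austin
```

In this toy example the first turn alone would select "Boston", but the hypothesis that is most plausible across both turns is "Austin", so the earlier misrecognition is corrected without an explicit confirmation question.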
