Exploiting repair context in interactive error recovery

In current speech applications, facilities for correcting recognition errors are limited to either choosing among alternative hypotheses (by voice or by mouse click) or respeaking. Information from the context of a repair is ignored. We developed a method that improves the accuracy of interactive correction of speech recognition errors by taking the context of the repair interaction into account. The basic idea is to apply the same language modeling information used in the initial decoding of continuous speech input to the decoding of (isolated word) repair input. The repair is not limited to speech; the user can choose to switch modality, for instance by spelling or handwriting a word. We implemented this idea by rescoring N-best lists obtained from decoding the repair input, using language model scores for the trigrams that include the corrected word. We evaluated the method on a set of repairs by respeaking, spelling, and handwriting, which we collected with our prototype continuous speech dictation interface. The method can significantly increase the accuracy of repair compared to recognizing the repair input as an independent event.
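The rescoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the additive log-score combination, the `lm_weight` parameter, the sentence-boundary padding, and the toy trigram table with a fixed floor for unseen trigrams are all assumptions made for illustration. Each repair hypothesis is scored by adding the log-probabilities of the (up to three) trigrams that contain the corrected word to the score from the repair recognizer.

```python
import math
from typing import Callable, List, Sequence, Tuple

Trigram = Tuple[str, str, str]
BOS, EOS = "<s>", "</s>"  # hypothetical sentence-boundary padding symbols


def context_trigrams(left: Sequence[str], word: str, right: Sequence[str]) -> List[Trigram]:
    """Return the three trigrams that contain the corrected word."""
    l = [BOS, BOS] + list(left)    # pad so two left neighbours always exist
    r = list(right) + [EOS, EOS]   # pad so two right neighbours always exist
    l2, l1 = l[-2], l[-1]
    r1, r2 = r[0], r[1]
    return [(l2, l1, word), (l1, word, r1), (word, r1, r2)]


def rescore_nbest(
    nbest: Sequence[Tuple[str, float]],
    left: Sequence[str],
    right: Sequence[str],
    trigram_logprob: Callable[[Trigram], float],
    lm_weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Add weighted trigram log-probabilities to each hypothesis score and re-sort."""
    rescored = []
    for word, score in nbest:
        lm = sum(trigram_logprob(t) for t in context_trigrams(left, word, right))
        rescored.append((word, score + lm_weight * lm))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy language model (assumed values): known trigrams plus a floor for unseen ones.
TOY_LM = {
    ("want", "to", "write"): math.log(0.2),
    ("to", "write", "a"): math.log(0.3),
    ("write", "a", "letter"): math.log(0.4),
}
FLOOR = math.log(1e-4)


def toy_logprob(trigram: Trigram) -> float:
    return TOY_LM.get(trigram, FLOOR)


# The repair recognizer alone prefers "right"; the context favours "write".
nbest = [("right", -1.0), ("write", -1.2)]
best = rescore_nbest(nbest, ["i", "want", "to"], ["a", "letter"], toy_logprob)
```

In this toy case the hypothesis "write" moves to the top of the list after rescoring, because all three of its context trigrams are known to the toy model while those for "right" fall back to the floor. The same function applies regardless of whether the N-best list came from respeaking, spelling, or handwriting, as long as the scores are log-domain and comparable within one list.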
