A robust system for human-machine dialogue in telephony-based applications

This paper presents a real-time system for human-machine spoken dialogue over the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and has proved robust enough to allow spontaneous interaction even for speakers whose utterances are poorly recognized. This robustness has been achieved by combining four elements: specific language models in the recognition phase of analysis, tolerance of spontaneous speech phenomena, a robust parser, and pragmatic-based dialogue knowledge. Integrating these modules allows the system to deal with partial or total breakdowns at other levels of analysis. We report field trial data for the system: the speech recognition metrics of word accuracy and sentence understanding rate, time-to-completion, time-to-acquisition of crucial parameters, and the degree of success of the interactions in providing speakers with the information they required. The evaluation data show that most of the subjects were able to interact fruitfully with the system. These results suggest that the design choices made to achieve robust behaviour are a promising way to create usable spoken language telephone systems.
