Testing dialogue systems by means of automatic generation of conversations

This paper presents a novel technique for testing spoken dialogue systems by automatically generating conversations. The technique makes it easy to test spoken dialogue systems under a variety of lab-simulated conditions, since the utterance corpus used to check the performance of the system can be readily varied or replaced. It is based on a module called the user simulator, whose purpose is to behave as real users do when they interact with dialogue systems. The behaviour of the simulator is determined by a set of scenarios that represent the goals of the users; the simulator's aim is to achieve the goals set in the scenarios during its interaction with the dialogue system. We have applied the technique to test a dialogue system developed in our lab. The test was carried out under different levels of white and babble noise, with and without a vector Taylor series (VTS) noise compensation technique. The results show that the dialogue system performance is worse under the babble noise conditions. The VTS technique was effective when dealing with noisy utterances and led to better experimental results, particularly for white noise. The testing technique also made it possible to detect problems in the dialogue strategies employed to handle confirmation turns and recognition errors, suggesting that these strategies must be improved. © 2002 Elsevier Science B.V. All rights reserved.
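The scenario-driven idea described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the names `Scenario`, `UserSimulator`, `recognise` and `run_dialogue`, the slot-filling dialogue with explicit confirmation turns, and the `error_rate` parameter (a crude stand-in for recognition errors under noise) are all assumptions made for illustration. Confirmation answers are assumed to be recognised perfectly, and each slot vocabulary is assumed to contain at least two words.

```python
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    """A user goal: the slot values the simulated user wants to convey."""
    goal: dict


class UserSimulator:
    """Hypothetical simulator that answers prompts according to its scenario."""

    def __init__(self, scenario):
        self.scenario = scenario

    def respond(self, prompt_type, slot, value=None):
        if prompt_type == "ask":
            # Answer a question with the value set in the scenario.
            return self.scenario.goal[slot]
        # Confirmation turn: accept only the value that matches the goal.
        return "yes" if value == self.scenario.goal[slot] else "no"


def recognise(utterance, vocabulary, error_rate, rng):
    """Stand-in for recognition under noise: with probability error_rate,
    return a different word from the slot vocabulary (assumed size >= 2)."""
    if rng.random() < error_rate:
        return rng.choice([w for w in vocabulary if w != utterance])
    return utterance


def run_dialogue(scenario, vocab, error_rate, rng, max_turns=20):
    """Generate one conversation; return (goal achieved, user turns used)."""
    user = UserSimulator(scenario)
    filled = {}
    turns = 0
    for slot in scenario.goal:
        while slot not in filled and turns < max_turns:
            heard = recognise(user.respond("ask", slot), vocab[slot], error_rate, rng)
            turns += 1
            answer = user.respond("confirm", slot, heard)
            turns += 1
            if answer == "yes":
                filled[slot] = heard
    return filled == scenario.goal, turns


if __name__ == "__main__":
    # Hypothetical scenario and vocabularies, chosen only for the demo.
    scenario = Scenario(goal={"food": "pizza", "drink": "water"})
    vocab = {"food": ["pizza", "salad", "burger"], "drink": ["water", "cola"]}
    rng = random.Random(0)
    for rate in (0.0, 0.2, 0.5):
        results = [run_dialogue(scenario, vocab, rate, rng) for _ in range(200)]
        completed = sum(ok for ok, _ in results) / len(results)
        mean_turns = sum(t for _, t in results) / len(results)
        print(f"error_rate={rate}: completion={completed:.2f}, mean turns={mean_turns:.1f}")
```

In this toy setting, raising `error_rate` plays the role of testing under a noisier condition: more confirmation turns are rejected, so conversations grow longer and some fail to reach the goal within `max_turns`. Swapping the scenario or vocabulary corresponds to changing the test conditions without recruiting real users.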
