THE UTILITY OF ELAPSED TIME AS A USABILITY METRIC FOR SPOKEN DIALOGUE SYSTEMS

It is commonly assumed that elapsed time is an important objective metric for evaluating the performance of spoken dialogue systems. However, our studies based on the PARADISE framework consistently find that other predictors contribute more strongly to user satisfaction than elapsed time does. In this paper, we show that several possible explanations for this apparently counter-intuitive finding are not plausible. We conclude that users of spoken dialogue systems are at least as attuned to qualitative aspects of the interaction as they are to elapsed time.
