An Experiment Setup for Collecting Data for Adaptive Output Planning in a Multimodal Dialogue System

We describe a Wizard-of-Oz (WOZ) experiment setup for collecting multimodal interaction data for a Music Player application. The setup was developed and used to collect experimental data as part of a project aimed at building a flexible multimodal dialogue system that provides an interface to an MP3 player, combining speech and screen input and output. Besides the usual goal of WOZ data collection, which is to obtain realistic examples of users' behavior and expectations, an equally important goal for us was to observe the natural behavior of multiple wizards in order to guide our system development. The wizards' responses were therefore not constrained by a script. One of the challenges we had to address was allowing the wizards to produce varied screen output in real time. Our setup includes a preliminary screen output planning module, which prepares several versions of possible screen output. The wizards were free to speak and/or to select a screen output.
