Scenario Design for Spoken Language Dialogue Systems Development

Adequate data acquired through the Wizard of Oz experimental prototyping method are still crucial to the cost-effective development of advanced spoken language dialogue systems. One important source of data corruption is the unintended priming of subjects through the task scenario representations used in the experiments. The paper presents the three sets of development and test scenario representations which were used in the Danish Dialogue project. Based on the third set of scenarios, an experiment was conducted to investigate the effects of a masking strategy which effectively avoids the possibility of priming the WOZ subjects. The experimental results are presented and discussed.

1. THE ROLE OF SCENARIOS IN SPOKEN LANGUAGE DIALOGUE SYSTEMS DESIGN

Scenarios are important tools in spoken language dialogue systems (SLDSs) development and testing. Nonetheless, the SLDS literature has little to say about scenario design and the many problems to be aware of. This paper presents conclusions from the Danish Dialogue project as regards the construction, representation and use of scenarios in SLDS design.

Over the last three years, the authors have designed and implemented the dialogue part of a realistic SLDS prototype, P2, which has been developed in collaboration with the Center for PersonKommunikation at Aalborg University and the Centre for Language Technology in Copenhagen. The domain of P2 is Danish domestic airline ticket reservation.

The P2 dialogue model was developed by means of the Wizard of Oz (WOZ) experimental prototyping method [3, 5, 6]. WOZ is an iterative process of testing and revising the dialogue model, which continues until the model is found acceptable for implementation. The implemented dialogue model is then subjected to further testing. Each of these tests requires the use of predefined scenarios. The purpose of using scenarios is to develop and test the dialogue model on the basis of realistic situations of use of the SLDS under construction.
Scenarios prescribe tasks embedded in realistic situations of use, which subjects, i.e. the persons acting as users, are asked to perform through spoken dialogue with the system. The scenario-based dialogues provide crucial data on user-system behaviour during dialogue, i.e. on user reactions to various aspects of the system’s behaviour and vice versa, as well as on users’ sublanguage vocabulary, utterance length, dialogue act types, number of turns per scenario, grammatical complexity, utterance ungrammaticality, task ordering preferences, problem-solving strategies, etc. An additional aim in using scenarios is to achieve some amount of systematicity in the testing process. There is, however, no known method for designing scenarios which are representative of all possible situations of use of the artefact being designed [7]. So the basic problem in scenario design is to capture, in a limited set of scenarios, as much as possible of the space of possible situations of use.

2. THE P2 DEVELOPMENT AND TEST