Running head: CO-OPERATIVITY IN SPOKEN DIALOGUE

Co-operativity in Human-Machine and Human-Human Spoken Dialogue

The paper presents principles of dialogue co-operativity derived from a corpus of task-oriented spoken human-machine dialogue. The corpus was recorded during the design of a dialogue model for a spoken language dialogue system. Analysis of the corpus produced a set of dialogue design principles intended to prevent users from having to initiate clarification and repair meta-communication which the system would not understand. Developed independently of Grice’s work on co-operation in spoken dialogue, these principles provide an empirical test of the correctness and completeness of Grice’s maxims of co-operativity in the case of human-machine dialogue. Whereas the maxims pass the test of correctness, they fail to provide a complete account of the principles of co-operative human-machine dialogue. A more complete set of aspects of co-operative task-oriented dialogue is proposed, together with the principles expressing those aspects. Transferability of the results to co-operative spoken human-human dialogue is discussed.

In the last couple of years, we have designed and implemented the dialogue component of a spoken language dialogue system prototype in the domain of Danish domestic flight reservation. We are currently testing the system together with the partners in the project. As the aim of the project is to develop a realistic, application-oriented prototype, the issue of user-system co-operativity has played a central role throughout our work on designing and implementing the dialogue structure. The present paper presents results on co-operativity in spoken human-machine dialogue and comparisons with human-human dialogue. We argue that dialogue co-operativity is crucial to the design of spoken language dialogue systems (SLDSs). Realistic systems are characterised by limited linguistic skills, a limited vocabulary, limited knowledge of the world and limited ability to leave dialogue initiative with their users.
They are not sensitive to prosodic features, such as intonation, vowel elongations, and pauses. They lack the average human's ability to draw inferences. The result is a largely system-directed dialogue rather than mixed initiative dialogue. While functionally adequate for a certain class of well-structured tasks, system-directed dialogue lacks the natural flexibility of the mixed-initiative dialogue that is characteristic of human-human interactions (Bernsen, Dybkjær & Dybkjær, 1994a; Bernsen, Dybkjær & Dybkjær, 1994b). SLDS designers are currently investigating how to tackle the difficult next step of enabling mixed initiative human-machine dialogue (Bernsen et al., 1994b; Dybkjær, Bernsen & Dybkjær, 1995; Goddeau, Brill, Glass, Pao, Phillips, Polifroni, Seneff & Zue, 1994; Peckham, 1993). It is possible today to approximate system-directed dialogue for fairly complex tasks. Despite being linguistically constrained, such systems behave naturally in terms of task-specific vocabulary, user input understanding that includes natural grammar and appropriate semantics, close-to natural management of discourse and close-to-real-time response. The technology thus enables the construction of usable task-oriented SLDSs which are tolerably inferior to the humans they replace, despite the fact that dialogue with such a system is, in effect, conversation with an idiot savant. In addition to our own system, examples of such systems are presented in Oerder and Aust (1994), Cole, Novick, Fanty, Vermeulen, Sutton, Burnett & Schalkwyk (1994), and Mazor, Braun, Ziegler, Lerner, Feng & Zhou (1994). A crucial point in what follows, however, is that system-directed dialogue breaks down when users ask questions of the system. A key, therefore, to the successful design of system-directed dialogue is to design the dialogue in such a way that users do not need to ask questions of the system.
To do this, we claim, requires optimising the dialogue co-operativity of the system. Given that dialogue initiative lies mainly with the current SLDSs, dialogue designers have to take every possible precaution to minimise the number of situations in which users are inclined to initiate meta-communication for purposes of clarification and repair. Meta-communication is communication about the dialogue itself rather than about the task domain of the dialogue. Human-human dialogue both allows for and is greatly assisted by clarification and repair meta-communication. If we are in doubt as to what our interlocutor said or meant, why a particular topic was raised, why it was raised at that particular point, or why it was raised in a particular way during dialogue, we initiate clarification and repair meta-communication to find out. Similarly, speakers often take advantage of the fact that their partners can demand elaboration at any point. This helps fine-tune the speaker's contributions and indicates interest from the partner. The standard way of initiating clarification and repair meta-communication is by asking questions of the interlocutor (Schegloff, Jefferson & Sacks, 1977). In largely system-directed dialogue, however, user-initiated clarification and repair meta-communication must be either avoided or restricted to the use of well-defined user commands, such as ‘correct’ or ‘repeat’, because the system is unable to understand unrestricted meta-communication. The achievement of mixed-initiative meta-communication dialogue is an important goal in SLDS design which lies beyond the scope of this paper. When the dialogue has been designed to optimise co-operativity, users do not need to ask meta-communicative questions of the system in order to understand it (Bilange, 1991; Eckert & McGlashan, 1993).
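As a concrete illustration, the restriction of user meta-communication to well-defined keyword commands can be sketched as follows. This is a hypothetical Python sketch, not the implementation of our system; the function names and the two commands 'correct' and 'repeat' are illustrative of the mechanism described above.

```python
def make_recogniser(script):
    """Stand-in for a speech recogniser: returns scripted user answers in order."""
    answers = iter(script)
    return lambda prompt: next(answers)

def run_turn(prompt, recognise, history):
    """One system-directed turn: the system asks, the user answers.

    Only two keyword commands are understood as meta-communication;
    anything else is treated as an in-domain answer filling a slot.
    """
    while True:
        answer = recognise(prompt).strip().lower()
        if answer == "repeat":
            # Re-play the same system prompt on the next loop iteration.
            continue
        if answer == "correct":
            # Discard the most recently filled slot, then re-ask.
            if history:
                history.pop()
            continue
        # In-domain answer: record the (prompt, answer) slot and move on.
        history.append((prompt, answer))
        return answer

# Example exchange: the user first asks for a repeat, then answers.
history = []
recognise = make_recogniser(["repeat", "copenhagen"])
run_turn("From which airport?", recognise, history)
```

The point of the sketch is that the dialogue never has to parse open-ended user questions: the only user-initiated repair moves are drawn from a closed command vocabulary, which keeps the interaction within the system's limited understanding.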
If the system's contributions were already fine-tuned to conform to the human interlocutor's expectations, there would be no need for clarification and repair meta-communication which the user cannot manage by invoking simple mechanisms, such as the keywords 'correct' and 'repeat'. In order to optimise user-system co-operativity, we developed a set of general usability principles to be observed in co-operative human-machine dialogue design. This made it possible to apply the principles in our dialogue design and, just as importantly, to re-use the same principles in other human-machine dialogue design efforts. Having developed and applied the principles, we became aware of the link between our work and Grice’s Co-operative Principle and maxims (Grice, 1975). According to Grice’s Co-operative Principle (CP), to act co-operatively in conversation, one should make one’s “conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which one is engaged” (Grice, 1975, p.26). Grice proposes that the Co-operative Principle can be explicated in terms of maxims of co-operative human-human dialogue, as discussed later. Although Grice’s maxims were conceived with a different purpose in mind, they can be seen as serving the same objective as do our principles, namely that of preventing interlocutor-initiated clarification and repair meta-communication. From this viewpoint, the main difference between Grice’s work and ours is that the maxims were developed to account for co-operativity in human-human dialogue rather than in human-machine dialogue, whereas our principles were developed from analysis of a corpus of simulated human-machine dialogues. Another point of potential interest is that, at least superficially, our set of principles is considerably larger than Grice’s set of maxims.
These differences provide an opportunity to (a) test how Grice’s theory of dialogue co-operativity works in the domain of highly restricted human-machine dialogue; (b) compare co-operative human-human dialogue with co-operative human-machine dialogue; and (c) potentially augment the basis of a theory of spoken dialogue co-operativity. It would be useful to briefly summarise some main differences between human-human and human-computer task-oriented dialogue. The latter consists of highly specialised mutual-goal exchanges between partners of vastly different skills of language and comprehension (cf. the system limitations noted above). Yet the inferior partner controls the dialogue. This rarely occurs in the human-human exchanges which are closest to the human-computer interactions we are considering. In those exchanges, the interlocutor with superior skills normally takes over the dialogue initiative and simplifies vocabulary, makes his or her own contributions more explicit, and asks more questions. The reason why, in our human-computer interactions, the inferior dialogue partner may control the dialogue is that this partner is the domain expert. This combination of speaker properties seems different from anything found in human-human dialogue, even from an exchange with an esteemed foreign expert. In addition, the human-computer dialogue has no role for prosodic features. Still, the dialogue serves the accomplishment of the task. This is only possible because the human interlocutors understand how to perform the task without the use of prosody and, more generally, display enough flexibility in dialogue to communicate with a partner that is more idiot savant than any human interlocutor could ever be. The next section describes how our principles were developed. The principles are then presented. The subsequent section presents Grice’s theory of co-operativity and situates the comparative analysis to follow.
This completes the preparations for providing a detailed analysis of the relationship between the two sets of principles. The result is an account of co-operativity in spoken human-machine dialogue whose applicability to human-human dialogue is discussed in the concluding section. For ease of reference, the term maxim will refer to Grice’s

[1] Eric Bilange, et al. A task independent oral dialogue model. EACL, 1991.
[2] Niels Ole Bernsen, et al. Wizard-of-Oz and the trade-off between naturalness and recogniser constraints. EUROSPEECH, 1993.
[3] Nigel Gilbert, et al. Simulating speech systems. 1991.
[4] Yan Huang, et al. A neo-Gricean pragmatic theory of anaphora. 1991.
[5] Niels Ole Bernsen, et al. Exploring the limits of system-directed dialogue: dialogue evaluation of the Danish dialogue system. EUROSPEECH, 1995.
[6] D. Over, et al. Studies in the Way of Words. 1989.
[7] Han Zhou, et al. OASIS: a speech recognition system for telephone service orders. ICSLP, 1994.
[8] Srikant Sarangi, et al. Non-cooperation in communication: a reassessment of Gricean pragmatics. 1992.
[9] E. Schegloff, et al. The preference for self-correction in the organization of repair in conversation. 1977.
[10] Stephen C. Levinson, et al. Minimization and conversational inference. 1987.
[11] Jeremy Peckham, et al. A new generation of spoken dialogue systems: results and lessons from the SUNDIAL project. EUROSPEECH, 1993.
[12] Harald Aust, et al. A realtime prototype of an automatic inquiry system. ICSLP, 1994.
[13] S. Levinson. Pragmatics and the grammar of anaphora: a partial pragmatic reduction of Binding and Control phenomena. Journal of Linguistics, 1987.
[14] Ronald A. Cole, et al. A prototype voice-response questionnaire for the U.S. census. ICSLP, 1994.
[15] Richard E. Grandy. On Grice on language. 1989.
[16] Niels Ole Bernsen, et al. A dedicated task-oriented dialogue theory in support of spoken language dialogue systems design. ICSLP, 1994.
[17] Victor Zue, et al. GALAXY: a human-language interface to on-line travel information. ICSLP, 1994.
[18] D. Sperber, et al. Précis of Relevance: Communication and Cognition. 1987.
[19] Scott McGlashan, et al. Managing spoken dialogues for information services. EUROSPEECH, 1993.
[20] H. Grice. Further notes on logic and conversation. 1978.