Expanding a time-sensitive conversational architecture for turn-taking to handle content-driven interruption

Turn-taking in spoken language systems has generally been push-to-talk or strict alternation (user speaks, system speaks, user speaks, …), with some systems, such as telephone-based systems, handling barge-in (interruption by the user). In this paper we describe our time-sensitive conversational architecture for turn-taking, which allows not only alternating turns and barge-in but other conversational behaviors as well: backchanneling, prompting the user by taking more than one turn if necessary, and overlapping speech. The architecture is implemented in a Reading Tutor that listens to children read aloud and helps them. We extended this architecture to allow the Reading Tutor to interrupt the student based on a non-self-corrected mistake – “content-driven interruption”. To the best of our knowledge, the Reading Tutor is thus the first spoken language system to intentionally interrupt the user based on the content of the utterance.

1. MOTIVATION

Rich turn-taking is a ubiquitous feature of human-human spoken dialog. Rather than merely alternate between speakers, people backchannel, take multiple turns, interrupt each other, and finish each other’s sentences (Fox 1993, Sacks et al. 1974, Duncan 1972). In tutorial dialog, rich turn-taking plays a substantial role in pedagogical effectiveness (Fox 1993). For example, the amount of time that a teacher waits after asking a question before answering it herself affects student learning (Stahl 1994, Rowe 1972): wait times of more than three seconds lead to better student learning (Tobin 1987, Tobin 1986).

In order to expand the capabilities of spoken dialog systems to handle rich turn-taking, we started with a domain that has a relatively simple content-based discourse model – oral reading tutoring – and that is also interesting and important in its own right. Because the content of the interaction is simple, we are able to focus on other aspects of the dialog, specifically turn-taking behavior.

[1] Gregory Aist. Challenges for a Mixed Initiative Spoken Dialog System for Oral Reading Tutoring, 1997.

[2] Mary Budd Rowe, et al. Wait-Time and Rewards as Instructional Variables: Their Influence on Language, Logic, and Fate Control, 1972.

[3] Jack Mostow, et al. Demonstration of a reading coach that listens, 1995, UIST '95.

[4] K. Tobin. The Role of Wait Time in Higher Cognitive Level Learning, 1987.

[5] R. Stahl. Using "Think-Time" and "Wait-Time" Skillfully in the Classroom, 1994.

[6] Martin J. Russell, et al. Applications of automatic speech recognition to speech and language development in young children, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[7] K. Tobin. Effects of Teacher Wait Time on Discourse Characteristics in Mathematics and Language Arts Classes, 1986.

[8] Jared Bernstein, et al. A voice interactive language instruction system, 1991, EUROSPEECH.

[9] Jack Mostow, et al. Towards a Reading Coach that Listens: Automated Detection of Oral Reading Errors, 1993, AAAI.

[10] Jack Mostow, et al. The Sounds of Silence: Towards Automated Evaluation of Student Learning in a Reading Tutor that Listens, 1997, AAAI/IAAI.

[11] S. Duncan, et al. Some Signals and Rules for Taking Speaking Turns in Conversations, 1972.

[12] Jack Mostow, et al. Adapting Human Tutorial Interventions for a Reading Tutor that Listens: Using Continuous Speech Recognition in Interactive Educational Multimedia, 1997.

[13] Jack Mostow, et al. A Prototype Reading Coach that Listens, 1994, AAAI.

[14] Mary Budd Rowe. Wait-Time and Rewards as Instructional Variables, Their Influence on Language, Logic, and Fate Control: Part One--Wait-Time, 1974.

[15] E. Schegloff, et al. A simplest systematics for the organization of turn-taking for conversation, 1974.

[16] Barbara A. Fox. The Human Tutorial Dialogue Project: Issues in the Design of Instructional Systems, 1993.