Paper Information - Incremental, Adaptive and Interruptive Speech Realization for Fluent Conversation with ECAs


Human conversations are highly dynamic, responsive interactions. In such interactions, utterances are produced incrementally, subject to on-the-fly adaptation (e.g. speaking louder to keep a challenged turn) and (self-)interruptions. While listening, interlocutors construct plans for their next speaking contributions, allowing very rapid turn transitions. To enable such fluent interaction in Embodied Conversational Agents (ECAs), we must steer away from the traditional turn-based, non-incremental interaction paradigm, in which the ECA first fully analyzes the user's contribution and then fully plans its own contribution, which is executed entirely ballistically (allowing neither adaptation nor interruption of ongoing behavior). Recently, several systems have done exactly this, introducing one or more aspects of incrementality, interruptibility, or adaptivity. Their focus is mostly on behavior planning, and they introduce a limited set of behavior realization capabilities only where these help illustrate their flexible planning strategies. Furthermore, many of these systems can be characterized as proofs of concept, designed for a single purpose, a single domain (for example, only generating backchannel feedback), or a single set of experiments. The focus of our ongoing work is to provide a comprehensive architecture that unifies the fluent behavior realization functionality of these more experimental systems and can additionally reproduce other important phenomena that occur in fluent dialog. Our architecture serves 1) as a platform for those experimenting with the ‘best’ way to deploy fluent behavior realization strategies or researching the social effects of certain deployment strategies, and 2) as a building block (specifically the Behavior Realizer) in an ECA architecture that supports fluent human-ECA interaction.
To this end, we provide behavior planners and human authors of behavior realization strategies with a language for specifying behavior realization plans that allow fluent interaction with an ECA. Our specification language and realizer implementation provide:

Stefan Kopp | David Schlangen | Herwin van Welbergen | Timo Baumann

[1] David Schlangen, et al. INPRO_iSS: A Component for Just-In-Time Incremental Speech Synthesis, 2012, ACL.

[2] Stefan Kopp, et al. An Incremental Multimodal Realizer for Behavior Co-Articulation and Coordination, 2012, IVA.