Natural interaction with a virtual guide in a virtual environment

This paper describes the Virtual Guide, a multimodal dialogue system embodied as a conversational agent that helps users find their way in a virtual environment while adapting its affective linguistic style to that of the user. We discuss the system's modular architecture and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the politeness level of the user's utterances in real time during the dialogue and aligns its own language with the user's, applying different politeness strategies. Finally, we report on our first user tests and discuss some potential extensions to improve the system.
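The detect-then-align loop the abstract describes can be sketched in miniature. The following is a hypothetical toy, not the paper's implementation: the marker lexicons, the scoring function, and the three reply strategies are all invented for illustration, standing in for whatever classifier and generation strategies the actual system uses.

```python
# Toy sketch of politeness detection and alignment (hypothetical, not the
# system described in the paper): score an utterance's politeness from
# marker words, then pick a direction-giving phrasing that mirrors it.

POLITE_MARKERS = {"please", "could", "would", "thanks", "sorry"}
BLUNT_MARKERS = {"now", "must", "hurry"}

def politeness_level(utterance: str) -> float:
    """Return a politeness score in [-1, 1] from marker-word counts."""
    words = [w.strip(".,?!") for w in utterance.lower().split()]
    polite = sum(w in POLITE_MARKERS for w in words)
    blunt = sum(w in BLUNT_MARKERS for w in words)
    total = polite + blunt
    return 0.0 if total == 0 else (polite - blunt) / total

def aligned_reply(level: float) -> str:
    """Choose a politeness strategy matching the user's detected level."""
    if level > 0.3:   # user is polite: indirect, face-saving phrasing
        return "If you like, you could take the corridor on your left."
    if level < -0.3:  # user is blunt: bare imperative
        return "Take the corridor on your left."
    return "You can take the corridor on your left."
```

A real system would replace the lexicon with a trained classifier and drive a full generation component, but the shape of the loop (score the input, condition the output strategy on the score) is the same.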
