Situated interaction with a virtual human - perception, action, and cognition

In Virtual Reality environments, real humans can meet virtual humans to collaborate on tasks. The agent Max is such a virtual human, providing the human user with a face-to-face collaboration partner in the SFB 360 construction tasks. This paper describes how Max can assist by combining manipulative capabilities for assembly actions with conversational capabilities for mixed-initiative dialogue. During the interaction, Max employs speech, gaze, facial expression, and gesture, and is able to initiate assembly actions. We present the underlying model of Max's competences for managing situated interactions, and we show how the required faculties of perception, action, and cognition are realized and connected in his architecture.
