Experimental Evaluation of Bi-directional Multimodal Interaction with Conversational Agents

In the field of intuitive HCI, Embodied Conversational Agents (ECAs) are being developed mostly with speech input. In this paper, we study whether an additional input modality leads to more effective and pleasant “bi-directional” multimodal communication. In a Wizard-of-Oz experiment, adults and children were videotaped while interacting with 2D animated agents in a game application. Each subject carried out a multimodal scenario (speech and/or pen input) and a speech-only scenario. The results confirm the usefulness of multimodal input, which yielded shorter scenarios and higher, more homogeneous ease-of-use ratings. Additional results underscored the importance of gestural interaction for children and revealed a modality specialization for certain actions. Finally, multidimensional analyses uncovered links between behavioral and subjective data, such as an association between pen use and pleasantness for children. These results can be used both in developing the functional prototype and in the general framework of ECA-system evaluation and specification.
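The reported association between pen use and pleasantness could be probed, for example, with a rank correlation between each child's pen-use rate and their pleasantness rating. The following is a minimal illustrative sketch, not the paper's actual analysis; the variable names and data values are hypothetical placeholders.

```python
# Illustrative sketch (not from the paper): testing an association between
# pen-use rate and rated pleasantness, in the spirit of the multidimensional
# analyses described above. All data below are hypothetical placeholders.
from scipy.stats import spearmanr

# Hypothetical per-child measures: proportion of actions performed with the
# pen in the multimodal scenario, and a 1-7 pleasantness rating.
pen_use_rate = [0.62, 0.41, 0.78, 0.55, 0.69, 0.33, 0.81, 0.47]
pleasantness = [6, 4, 7, 5, 6, 3, 7, 5]

# Spearman's rank correlation is robust to non-normal rating scales.
rho, p_value = spearmanr(pen_use_rate, pleasantness)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```

A positive rho with a small p-value would support the association; with real data one would also want to control for age group and scenario order.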
