Towards Experimental Specification and Evaluation of Lifelike Multimodal Behavior

ABSTRACT

In this paper we introduce the Limsi Embodied Agent project, which tackles the following issues in Embodied Conversational Agent (ECA) specification and evaluation: the need to ground an ECA's behavior in annotated video recordings of application-dependent human behavior, the granularity of the language for specifying the ECA's multimodal behavior, and the evaluation of the use of ECAs in human-computer interaction. We describe preliminary work and future directions on each of these issues.

Categories and Subject Descriptors

H.5.2, H.5.1 [Information Interfaces and Presentation]: User Interfaces – interaction styles, standardization, ergonomics, user interface management systems; Multimedia Information Systems – evaluation/methodology.

General Terms

Design, Experimentation, Human Factors, Standardization.

Keywords

Multimodal interaction and integration, multimodal coding scheme.

1. INTRODUCTION

There is still a lack of appropriate and comprehensive answers to the question of what constitutes "natural" behavior for an Embodied Conversational Agent (ECA). The specification of an ECA's multimodal behavior is often based on knowledge extracted from the literature in several domains such as psychology, sociology, and linguistics. As partly suggested by [14][6], we believe that in order to be lifelike, the multimodal behavior of agents needs to be grounded in experimental studies in the same application context (e.g., the multimodal behavior of a pedagogical ECA should be based on video recordings and annotations of teachers' behavior in "similar" settings). In this paper, we describe how we intend to apply such an experimental approach with the Limsi Embodied Agent (LEA). But how do we go from annotating human multimodal behavior to specifying the behavior of an ECA? Existing specification languages are mostly dedicated either to low-level monomodal specifications (e.g., an angry facial expression) or to amodal "higher-level" specifications that are translated into monomodal features (e.g., an angry behavior generating facial expression, intonation, gaze…). In the LEA project, we define an intermediate level of specification based on types of cooperation between communicative modalities, which can be useful for the fine-grained specification and evaluation of multimodal communicative behavior based on video corpus annotation [20]. Finally, we describe our global methodological framework, which can be considered a checklist for defining the evaluation process of ECAs.
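To make this intermediate level concrete, the following Python sketch shows how a behavior could be specified as monomodal acts bound together by a cooperation type (redundancy, complementarity, etc., after the TYCOON framework [18][22]). This is a minimal sketch for exposition only; the class, attribute, and example names are hypothetical and do not correspond to the actual LEA specification language.

from dataclasses import dataclass
from enum import Enum, auto

class Cooperation(Enum):
    """Types of cooperation between modalities (after TYCOON [18][22])."""
    EQUIVALENCE = auto()      # the same information could be conveyed by either modality
    REDUNDANCY = auto()       # the same information is conveyed by several modalities
    COMPLEMENTARITY = auto()  # different parts of one message are split across modalities
    SPECIALIZATION = auto()   # a given kind of information is always conveyed by the same modality
    TRANSFER = auto()         # information produced by one modality is used by another
    CONCURRENCY = auto()      # independent pieces of information are conveyed in parallel

@dataclass
class MonomodalAct:
    # Low-level, single-modality specification (e.g., one facial expression).
    modality: str  # e.g., "speech", "face", "gaze", "gesture"
    content: str   # modality-specific parameters, kept abstract here

@dataclass
class MultimodalAct:
    # Intermediate-level specification: monomodal acts bound by a cooperation type.
    cooperation: Cooperation
    acts: list[MonomodalAct]

# Hypothetical example: the agent refers to an object by naming it in speech
# while pointing at it, i.e., speech and gesture cooperate by complementarity.
refer_to_object = MultimodalAct(
    cooperation=Cooperation.COMPLEMENTARITY,
    acts=[
        MonomodalAct(modality="speech", content="this red cube"),
        MonomodalAct(modality="gesture", content="deictic: target=cube_1"),
    ],
)

Such a representation is intended to support both directions of the approach outlined above: cooperation types can be measured in annotated video corpora and then reused to drive the specification of the agent's behavior.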

REFERENCES

[1] Thomas Rist, et al. Integrating reactive and scripted behaviors in a life-like presentation agent, 1998, AGENTS '98.

[2] Yoichi Takebayashi, et al. Spontaneous speech dialogue system TOSBURG II and its evaluation, 1994, Speech Communication.

[3] James C. Lester, et al. The Case for Social Agency in Computer-Based Teaching: Do Students Learn More Deeply When They Interact With Animated Pedagogical Agents?, 2001.

[4] Frank Guerin, et al. Conversational Sales Assistants, 2000.

[5] Jonas Beskow, et al. Developing and evaluating conversational agents, 2001.

[6] Dominique L. Scapin, et al. Ergonomic criteria for evaluating the ergonomic quality of interactive systems, 1997, Behav. Inf. Technol.

[7] Jean-Claude Martin, et al. Multimodal and Adaptative Pedagogical Resources, 2002, LREC.

[8] Yukiko I. Nakano, et al. Non-Verbal Cues for Discourse Structure, 2001.

[9] Ana Paiva, et al. The Storyteller: Building a Synthetic Character That Tells Stories, 2001.

[10] Björn Granström, et al. Multimodal feedback cues in human-machine interactions, 2002, Speech Prosody 2002.

[11] David R. Traum, et al. Embodied agents for multi-party dialogue in immersive virtual worlds, 2002, AAMAS '02.

[12] Mark Steedman, et al. Animated conversation: rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents, 1994, SIGGRAPH.

[13] Yukiko I. Nakano, et al. MACK: Media lab Autonomous Conversational Kiosk, 2002.

[14] Hao Yan, et al. More than just a pretty face: conversational protocols and the affordances of embodiment, 2001, Knowl. Based Syst.

[15] Pinar Yolum. The Fifth International Conference on Autonomous Agents, 2001.

[16] Kristinn R. Thórisson, et al. The Power of a Nod and a Glance: Envelope vs. Emotional Feedback in Animated Conversational Agents, 1999, Appl. Artif. Intell.

[17] Justine Cassell, et al. Relational agents: a model and implementation of building user trust, 2001, CHI.

[18] Jean-Claude Martin. On the Annotation of Multimodal Behavior and Computation of Cooperation Between Modalities, 2000.

[19] Mervyn A. Jack, et al. Evaluating humanoid synthetic agents in e-retail applications, 2001, IEEE Trans. Syst. Man Cybern. Part A.

[20] Justine Cassell, et al. Embodiment in conversational interfaces: Rea, 1999, CHI '99.

[21] Pattie Maes, et al. Agents with Faces: The Effects of Personification of Agents, 1996.

[22] Jean-Claude Martin, et al. Annotating and Measuring Multimodal Behaviour - Tycoon Metrics in the Anvil Tool, 2002, LREC.

[23] Justine Cassell, et al. BEAT: the Behavior Expression Animation Toolkit, 2001, Life-like characters.