Towards a Common Framework for Multimodal Generation: The Behavior Markup Language

This paper describes an international effort to unify a multimodal behavior generation framework for Embodied Conversational Agents (ECAs). We propose a three-stage model, called SAIBA, whose stages represent intent planning, behavior planning, and behavior realization. A Function Markup Language (FML), which describes intent without referring to physical behavior, mediates between the first two stages, and a Behavior Markup Language (BML), which describes the desired physical realization, mediates between the last two. In this paper we focus on BML. The hope is that this abstraction and modularization will help ECA researchers pool their resources to build more sophisticated virtual humans.
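To make the pipeline concrete, the following is a minimal sketch of the three SAIBA stages as plain Python functions. All function names and the FML/BML fragments shown are hypothetical placeholders invented for illustration; the actual FML and BML specifications are defined by the SAIBA effort itself.

```python
# Illustrative sketch only: a minimal rendering of the SAIBA three-stage
# pipeline (intent planning -> behavior planning -> behavior realization).
# The function names and markup strings below are hypothetical, not the
# actual FML/BML specifications.

def plan_intent(dialogue_state: str) -> str:
    """Stage 1 (intent planning): emit communicative intent as an FML
    fragment, with no reference to physical behavior."""
    return '<fml><performative type="greet" addressee="user"/></fml>'

def plan_behavior(fml: str) -> str:
    """Stage 2 (behavior planning): map the intent onto concrete
    multimodal behaviors, expressed as a BML fragment."""
    if 'type="greet"' in fml:
        return ('<bml>'
                '<speech id="s1">Hello there!</speech>'
                '<gesture id="g1" type="wave" start="s1:start"/>'
                '</bml>')
    return '<bml/>'

def realize_behavior(bml: str) -> None:
    """Stage 3 (behavior realization): drive the animation and speech
    engines. Here we simply print the schedule."""
    print("Realizing:", bml)

if __name__ == "__main__":
    fml = plan_intent("user just arrived")
    bml = plan_behavior(fml)
    realize_behavior(bml)
```

Because each stage communicates only through a markup string, any module that consumes or produces well-formed FML or BML can be swapped in, which is precisely the modularity the framework aims for.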
