BEAT: the Behavior Expression Animation Toolkit

The Behavior Expression Animation Toolkit (BEAT) allows animators to input typed text that they wish to be spoken by an animated human figure, and to obtain as output appropriate and synchronized nonverbal behaviors and synthesized speech in a form that can be sent to a number of different animation systems. The nonverbal behaviors are assigned on the basis of actual linguistic and contextual analysis of the typed text, relying on rules derived from extensive research into human conversational behavior. The toolkit is extensible, so that new rules can be quickly added. It is designed to plug into larger systems that may also assign personality profiles, motion characteristics, scene constraints, or the animation styles of particular animators.
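
The abstract does not spell out the pipeline's internals, but the core idea it describes, rule-based assignment of nonverbal behavior driven by linguistic analysis of typed text, with an extensible rule set, can be illustrated with a minimal sketch. Everything below is hypothetical (the names Behavior, annotate, and the new-word and question heuristics are invented for illustration); it is not the toolkit's actual implementation, only a toy instance of the same rule-based approach.

    # Minimal sketch of a BEAT-style rule pass (illustrative only).
    # Toy rules: words not previously mentioned ("new" information)
    # get a beat gesture; a question gets an eyebrow raise over its
    # final word. Rules are independent statements appended to the
    # loop, mirroring the extensibility the abstract describes.
    from dataclasses import dataclass

    @dataclass
    class Behavior:
        kind: str    # e.g. "beat_gesture", "eyebrow_raise"
        start: int   # index of first word covered
        end: int     # index of last word covered

    def annotate(text, seen=None):
        seen = set() if seen is None else seen
        words = text.rstrip("?.!").split()
        behaviors = []
        for i, w in enumerate(words):
            if w.lower() not in seen:   # crude "new information" test
                behaviors.append(Behavior("beat_gesture", i, i))
                seen.add(w.lower())
        if text.strip().endswith("?"):  # crude question detector
            behaviors.append(Behavior("eyebrow_raise",
                                      len(words) - 1, len(words) - 1))
        return behaviors

    if __name__ == "__main__":
        for b in annotate("Did you see the red ball?"):
            print(b)

In the real toolkit the output would be synchronized with synthesized speech and handed to an animation system; here the printed behavior spans stand in for that downstream interface.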
