[HUGE]: Universal Architecture for Statistically Based HUman GEsturing

We introduce HUGE, a universal architecture for statistically based HUman GEsturing, for producing and using statistical models of facial gestures driven by any kind of inducement. By inducement we mean any signal that occurs in parallel with the production of gestures in human behaviour and that may be statistically correlated with their occurrence, e.g. the text being spoken, the audio signal of speech, or biosignals. In the training phase, this correlation is used to build a statistical model of gestures from a corpus consisting of gesture sequences and the corresponding inducement data sequences. In the runtime phase, raw, previously unseen inducement data triggers (induces) the agent's gestures in real time according to the previously constructed statistical model. We present the general architecture and the implementation issues of our system, and further clarify both through two case studies. We believe this universal architecture is useful for experimenting with various potential inducement signals and their features, and for exploring how such signals or features correlate with gesturing behaviour.
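To make the two phases concrete, the sketch below caricatures the pipeline in Python with a simple conditional frequency table standing in for the statistical model. All names here (GestureModel, induce, the toy feature labels) are illustrative assumptions, not the paper's actual model or API; a real system would use richer statistics over timed gesture and inducement sequences.

```python
"""Minimal sketch of the HUGE training/runtime pipeline (hypothetical names)."""
import random
from collections import defaultdict


class GestureModel:
    """Stand-in statistical model: P(gesture | inducement feature) by counting."""

    def __init__(self):
        # counts[feature][gesture] -> co-occurrences observed in the corpus
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, corpus):
        """Training phase: corpus is a time-aligned sequence of
        (inducement_feature, gesture) pairs from recorded human behaviour."""
        for feature, gesture in corpus:
            self.counts[feature][gesture] += 1

    def induce(self, feature):
        """Runtime phase: given a previously unseen inducement feature,
        sample a gesture from the learned conditional distribution."""
        dist = self.counts.get(feature)
        if not dist:
            return None  # no statistics for this feature: stay idle
        gestures = list(dist.keys())
        weights = list(dist.values())
        return random.choices(gestures, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Toy corpus: inducement features (e.g. from speech or text analysis)
    # paired with the facial gestures observed at the same moments.
    corpus = [
        ("stressed_syllable", "eyebrow_raise"),
        ("stressed_syllable", "head_nod"),
        ("stressed_syllable", "eyebrow_raise"),
        ("pause", "gaze_away"),
        ("pause", "blink"),
    ]
    model = GestureModel()
    model.train(corpus)

    # Drive the agent from a new inducement stream at runtime.
    for feature in ["stressed_syllable", "pause", "stressed_syllable"]:
        print(feature, "->", model.induce(feature))
```

Because the architecture only assumes a statistical correlation between inducement features and gestures, the same two-phase structure applies whether the inducement is text, speech audio, or biosignals; only the feature extraction and the concrete model would change.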
