Modelling of emotional facial expressions during speech in synthetic talking heads using a hybrid approach

Talking heads and virtual characters able to communicate complex information with human-like expressiveness and naturalness should be able to display emotional facial expressions. In recent years several works, both rule-based and statistical, have obtained important results in modelling emotional facial expressions for use in synthetic talking heads. However, most rule-based systems suffer from static generation, since the set of rules and their combinations is limited. On the other hand, most work on synthesis with statistical approaches is restricted to speech and lip movements. This paper presents a model of the dynamics of emotional facial expressions based on a hybrid statistical/machine-learning approach. The approach combines Hidden Markov Models (HMMs) and Recurrent Neural Networks (RNNs), aiming to benefit from the advantages of both paradigms while overcoming their respective limitations.
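The hybrid pipeline the abstract describes can be pictured as a two-stage process: a discrete statistical model selects an emotional-state sequence over time, and a recurrent model turns the stepwise state targets into a smooth facial-parameter trajectory. The toy sketch below illustrates that division of labour only; all states, transition probabilities, facial-parameter targets, and the smoothing weight are invented for illustration and are not the parameters of the paper's system.

```python
# Toy sketch of a hybrid HMM/RNN pipeline for facial-parameter synthesis.
# NOTE: every state name, probability, and weight here is a made-up example,
# not the configuration used in the paper.

# Stage 1: an HMM-style Viterbi decoder picks the most likely emotional state
# per frame from per-frame observation scores (e.g. derived from the speech).
TRANSITIONS = {  # log-domain would be used in practice; plain probs for clarity
    "neutral": {"neutral": 0.8, "smile": 0.2},
    "smile":   {"neutral": 0.3, "smile": 0.7},
}
TARGETS = {"neutral": 0.0, "smile": 1.0}  # e.g. lip-corner displacement per state

def viterbi(obs_scores):
    """Return the most likely state sequence for a list of per-frame scores."""
    states = list(TRANSITIONS)
    best = {s: obs_scores[0][s] for s in states}  # best path prob ending in s
    back = []                                     # backpointers per frame
    for scores in obs_scores[1:]:
        new_best, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] * TRANSITIONS[p][s])
            new_best[s] = best[prev] * TRANSITIONS[prev][s] * scores[s]
            ptr[s] = prev
        back.append(ptr)
        best = new_best
    last = max(states, key=best.get)              # backtrack from the best end
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Stage 2: a minimal recurrent smoother turns the stepwise HMM targets into a
# continuous trajectory, standing in for the RNN that models the dynamics.
def rnn_smooth(targets, alpha=0.5):
    out, h = [], targets[0]
    for t in targets:
        h = alpha * h + (1 - alpha) * t  # leaky recurrent update
        out.append(h)
    return out

obs = [{"neutral": 0.9, "smile": 0.1},
       {"neutral": 0.4, "smile": 0.6},
       {"neutral": 0.2, "smile": 0.8}]
states = viterbi(obs)                              # ["neutral", "smile", "smile"]
trajectory = rnn_smooth([TARGETS[s] for s in states])
```

In this division of labour the HMM contributes a well-understood probabilistic segmentation of time into expression states, while the recurrent stage contributes continuous dynamics between them, which is the complementarity the abstract attributes to the hybrid design.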
