Paper Information - EAVA: A 3D Emotive Audio-Visual Avatar

EAVA: A 3D Emotive Audio-Visual Avatar

Emotive audio-visual avatars have the potential to significantly improve the quality of Human-Computer Interaction (HCI). This paper proposes the technical approaches behind a novel framework for a text-driven 3D Emotive Audio-Visual Avatar (EAVA). The work focuses on 3D face modeling, realistic emotional facial expression animation, emotive speech synthesis, and the co-articulation of speech gestures (i.e., lip movements due to speech production) and facial expressions. Experimental results indicate that EAVA achieves a certain degree of naturalness and expressiveness in both its audio and visual aspects. Further improvements can be expected from incorporating data-driven statistical learning models into the framework.

Yun Fu | Thomas S. Huang | Mark Hasegawa-Johnson | Hao Tang | Jilin Tu

[1] Thomas S. Huang, et al. 3D Face Processing: Modeling, Analysis and Synthesis, 2004.

[2] Marc Schröder, et al. Expressing vocal effort in concatenative synthesis, 2003.

[3] Thomas S. Huang, et al. Real-time speech-driven face animation with expressions using neural networks, 2002, IEEE Trans. Neural Networks.

[4] Felix Burkhardt, et al. Emofilt: the simulation of emotional speech by prosody-transformation, 2005, INTERSPEECH.

[5] Gregor Hofer, et al. Emotional Speech Synthesis, 2004.

[6] Javier Macías Guarasa, et al. Development of an emotional speech synthesiser in Spanish, 1999, EUROSPEECH.

[7] Michael Picheny, et al. The IBM expressive text-to-speech synthesis system for American English, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[8] Thomas S. Huang, et al. iFACE: A 3D Synthetic Talking Face, 2001, Int. J. Image Graph.

[9] Janet E. Cahn. Generating expression in synthesized speech, 1989.

[10] 飯田朱美. A study on corpus-based speech synthesis with emotion, 2002.

[11] Felix Burkhardt, et al. Simulation emotionaler Sprechweise mit Sprachsyntheseverfahren, 2000.

[12] Frédéric H. Pighin, et al. Expressive speech-driven facial animation, 2005, TOGS.

[13] Nadia Magnenat-Thalmann, et al. Principal components of expressive speech animation, 2001, Proceedings. Computer Graphics International 2001.