Assembling an expressive facial animation system

In this paper we investigate how an expressive facial animation system can be assembled from publicly available components. There is a large body of work on face modeling, facial animation, and conversational agents; however, most current research either targets a single aspect of a conversational agent or is tied to systems that are not publicly available. We propose a high-quality facial animation system that can be built easily from affordable, off-the-shelf components. The proposed system is modular, extensible, efficient, and suitable for a wide range of applications that require expressive speaking avatars. We demonstrate its effectiveness with two applications: (a) a text-to-speech synthesizer with expression control and (b) a conversational agent that can react to simple phrases.
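As a purely illustrative sketch (not the system described in the paper), the fragment below shows one way such a component-based pipeline might be wired: a text-to-speech stage emits timed phonemes, a phoneme-to-viseme table converts them into blendshape targets, and an expression layer adds an emotion-dependent offset. All names, tables, and the triangular blending envelope are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical timed-phoneme record that a TTS component might emit.
@dataclass
class TimedPhoneme:
    phoneme: str
    start: float      # seconds
    duration: float   # seconds

# Hypothetical phoneme-to-viseme mapping (blendshape weights per phoneme).
PHONEME_TO_VISEME: Dict[str, Dict[str, float]] = {
    "AA": {"jaw_open": 0.8, "lips_round": 0.1},
    "OW": {"jaw_open": 0.4, "lips_round": 0.9},
    "M":  {"jaw_open": 0.0, "lips_closed": 1.0},
}

# Hypothetical expression layer: a constant blendshape offset per emotion.
EXPRESSION_OFFSETS: Dict[str, Dict[str, float]] = {
    "neutral": {},
    "happy":   {"mouth_smile": 0.6, "brow_raise": 0.2},
}

def blendshape_track(phonemes: List[TimedPhoneme],
                     expression: str = "neutral",
                     fps: int = 30) -> List[Dict[str, float]]:
    """Sample per-frame blendshape weights from timed phonemes plus an
    expression offset. A simple triangular envelope stands in for a real
    coarticulation model."""
    end = max(p.start + p.duration for p in phonemes)
    frames: List[Dict[str, float]] = []
    for i in range(int(end * fps) + 1):
        t = i / fps
        weights = dict(EXPRESSION_OFFSETS.get(expression, {}))
        for p in phonemes:
            if p.start <= t <= p.start + p.duration:
                # Rise to the viseme target at the phoneme midpoint, then decay.
                mid = p.start + p.duration / 2
                w = 1.0 - abs(t - mid) / (p.duration / 2)
                for shape, target in PHONEME_TO_VISEME.get(p.phoneme, {}).items():
                    weights[shape] = max(weights.get(shape, 0.0), w * target)
        frames.append(weights)
    return frames

if __name__ == "__main__":
    speech = [TimedPhoneme("M", 0.0, 0.1),
              TimedPhoneme("AA", 0.1, 0.25),
              TimedPhoneme("M", 0.35, 0.1)]
    track = blendshape_track(speech, expression="happy")
    print(len(track), "frames; frame 5 =", track[5])
```

In such a sketch the TTS, viseme-mapping, and expression components remain independent, so any one of them could be replaced by a different off-the-shelf tool without touching the others.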
