Animating visible speech and facial expressions

We present four techniques for modeling and animating faces from a set of morph targets. The first derives parameters that control individual facial components and uses machine learning to map one type of parameter to another. The second fuses visible speech and facial expressions in the lower part of the face. The third combines coarticulation rules with kernel smoothing. The fourth is a new 3D tongue model with flexible, intuitive skeleton controls. Results on eight animated character models demonstrate that these techniques are effective.
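
To make two of the ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a face stored as a flat list of vertex coordinates, linear blending of morph targets relative to a neutral pose, and a Gaussian (Nadaraya-Watson) kernel smoother applied to keyframed target weights. The function names, the bandwidth value, and the toy data are all hypothetical, and the coarticulation rules themselves are not reproduced here.

```python
import math


def blend_morph_targets(neutral, targets, weights):
    """Linear morph-target blend: neutral + sum_i w_i * (target_i - neutral).

    All shapes are flat lists of vertex coordinates of equal length
    (an assumed layout, chosen only to keep the sketch short).
    """
    blended = list(neutral)
    for target, w in zip(targets, weights):
        for k, (n, t) in enumerate(zip(neutral, target)):
            blended[k] += w * (t - n)
    return blended


def kernel_smooth(key_times, key_values, t, bandwidth):
    """Nadaraya-Watson estimate of a weight track at time t.

    Each keyframe contributes in proportion to a Gaussian kernel centred
    on its time, so neighbouring keyframes influence each other.
    """
    kernels = [math.exp(-0.5 * ((t - kt) / bandwidth) ** 2) for kt in key_times]
    total = sum(kernels)
    if total == 0.0:
        # All kernels underflowed: fall back to the nearest keyframe.
        nearest = min(range(len(key_times)), key=lambda i: abs(t - key_times[i]))
        return key_values[nearest]
    return sum(k * v for k, v in zip(kernels, key_values)) / total


# Toy data (hypothetical): one "jaw open" target keyed fully on at t = 0.1 s.
neutral = [0.0, 0.0, 0.0]
jaw_open = [0.0, -1.0, 0.0]
key_times = [0.0, 0.1, 0.2]
key_values = [0.0, 1.0, 0.0]

# The smoothed weight at the keyframe stays below 1 because the
# neighbouring closed-mouth keyframes pull it down.
w = kernel_smooth(key_times, key_values, t=0.1, bandwidth=0.05)
frame = blend_morph_targets(neutral, [jaw_open], [w])
```

In this toy setup the smoothed weight never quite reaches the keyed value, which loosely resembles the target undershoot seen in coarticulated speech. A wider bandwidth spreads each keyframe's influence further in time, and a narrower one approaches plain keyframe interpolation.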
