Visual Estimation and Compression of Facial Motion Parameters—Elements of a 3D Model-Based Video Coding System

The MPEG-4 standard supports the transmission and composition of facial animation with natural video by including a facial animation parameter (FAP) set, which is defined from the study of minimal facial actions and is closely related to muscle actions. The FAP set enables model-based representation of natural or synthetic talking-head sequences and allows intelligible visual reproduction of facial expressions, emotions, and speech pronunciations at the receiver. This paper describes two key components we have developed for building a model-based video coding system: (1) a method for estimating FAPs based on our previously proposed piecewise Bézier volume deformation (PBVD) model, and (2) several methods for encoding FAPs. PBVD is a linear deformation model suitable for both the synthesis and the analysis of facial images, and each FAP is a basis function in this model. We present experimental results on PBVD-based animation, model-based tracking, and spatial-temporal compression of FAPs.
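To illustrate why a linear deformation model can serve both synthesis and analysis, the following minimal sketch treats each parameter as the weight of one basis displacement field. It is not the PBVD formulation itself: the random basis, the mesh size, and the function names `synthesize` and `estimate` are assumptions made purely for illustration, and the analysis step here works on 3D vertex positions rather than on image measurements.

```python
import numpy as np

def synthesize(neutral, basis, params):
    """Synthesis: displace the neutral mesh by a weighted sum of basis shapes.

    neutral: (3N,) stacked vertex coordinates of the neutral face
    basis:   (3N, K) one displacement field per parameter (hypothetical values)
    params:  (K,) parameter weights
    """
    return neutral + basis @ params

def estimate(neutral, basis, observed):
    """Analysis: recover the weights from observed vertices by least squares."""
    params, *_ = np.linalg.lstsq(basis, observed - neutral, rcond=None)
    return params

# Toy data: a random stand-in for a face mesh and its basis (assumption).
rng = np.random.default_rng(0)
num_vertices, num_params = 50, 6
neutral = rng.normal(size=3 * num_vertices)
basis = rng.normal(size=(3 * num_vertices, num_params))
true_params = rng.normal(size=num_params)

observed = synthesize(neutral, basis, true_params)
recovered = estimate(neutral, basis, observed)
```

Because the model is linear in the parameters, the same basis matrix drives animation in the forward direction and parameter recovery in the inverse direction. With noise-free observations and a full-rank basis, the recovered weights match the ones used for synthesis.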
