Separability of pose and expression in facial tracking and animation

We explore the application of facial tracking to automated re-animation. This requires recovering both head pose and facial expression from the facial movement of a performer. The two effects are coupled, however, and previous studies have not fully addressed this interaction. The solution proposed here is to solve explicitly, at each timestep, for the pose and expression variables. In principle this is a nonlinear inverse problem. However, an appropriate parameterisation of pose in terms of affine transformations with parallax, and of expression in terms of key-frames, reduces the problem to a bilinear one, which can then be solved directly by Singular Value Decomposition. Using this approach, actor-driven animation has been implemented in real time, at video field-rate, on two Indy desktop workstations.
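To illustrate how a bilinear problem can be solved directly by Singular Value Decomposition, the following is a minimal sketch of one generic approach, not the implementation described here. It assumes an observation vector q that is bilinear in a pose-like vector a and an expression-like vector b, with a known basis tensor W. The names solve_bilinear, W, q, a and b are hypothetical, and the random data stands in for real tracked shapes. The products a_i b_j are first estimated by linear least squares, and the resulting matrix is then factored into a and b by a rank-1 SVD.

```python
import numpy as np


def solve_bilinear(W, q):
    """Recover a (length m) and b (length n) from q ~ sum_ij a_i * b_j * W[:, i, j].

    W is an assumed, known basis tensor of shape (d, m, n) with d >= m * n.
    The overall scale ambiguity (a*s, b/s) is fixed by normalising a[0] to 1,
    so this sketch assumes the true a[0] is nonzero.
    """
    d, m, n = W.shape
    # Step 1: the problem is linear in the m*n products c_ij = a_i * b_j.
    A = W.reshape(d, m * n)
    c, *_ = np.linalg.lstsq(A, q, rcond=None)
    C = c.reshape(m, n)
    # Step 2: C should be the rank-1 outer product a b^T; the leading
    # singular triplet gives its best rank-1 approximation.
    U, s, Vt = np.linalg.svd(C)
    a = U[:, 0] * s[0]
    b = Vt[0]
    # Step 3: remove the scale (and sign) ambiguity.
    scale = a[0]
    return a / scale, b * scale


# Synthetic usage example with made-up dimensions.
rng = np.random.default_rng(0)
W_demo = rng.standard_normal((40, 3, 4))
a_true = np.array([1.0, 0.5, -2.0])
b_true = np.array([0.3, -1.0, 2.0, 0.7])
q_demo = np.einsum("dij,i,j->d", W_demo, a_true, b_true)

a_est, b_est = solve_bilinear(W_demo, q_demo)
print("max error in a:", np.abs(a_est - a_true).max())
print("max error in b:", np.abs(b_est - b_true).max())
```

With noisy observations the least-squares estimate of the product matrix is no longer exactly rank 1, and the SVD step then returns the closest rank-1 factorisation in the Frobenius norm.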
