A data-driven model for monocular face tracking

This paper describes a two-stage system for markerless 3D tracking of the pose and deformation of the human face in monocular image sequences. The first stage learns the space of possible facial deformations by applying principal component analysis to real stereo tracking data. The resulting model approximates any generic shape as a linear combination of shape basis vectors. The second stage, known as model-based monocular tracking, uses this low-complexity deformable model to track the pose and deformation of the face simultaneously from a single image sequence. The paper makes three main contributions. First, we demonstrate that a data-driven approach to model construction is suitable for tracking non-rigid objects and offers an elegant and practical alternative to constructing models manually with 3D scanners or CAD modelers. Second, we show that the method achieves good tracking accuracy (errors of less than 5 mm) and good robustness. Third, we demonstrate that the system generalizes very promisingly, in that a single 3D model can be used to track multiple persons.
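The first stage described above, learning a linear shape basis by principal component analysis, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the number of tracked points, the number of modes, and the synthetic data standing in for stereo tracking output are all assumptions made for illustration.

```python
import numpy as np


def learn_shape_basis(shapes, num_modes):
    """Learn a mean shape and a PCA basis from training shapes.

    shapes: (T, 3N) array; each row is one frame's N tracked 3D points,
            flattened. (Hypothetical layout, chosen for this sketch.)
    """
    mean_shape = shapes.mean(axis=0)
    centered = shapes - mean_shape
    # Rows of vt are orthonormal principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_shape, vt[:num_modes], singular_values


def project(shape, mean_shape, basis):
    """Deformation coefficients of a shape in the learned basis."""
    return basis @ (shape - mean_shape)


def reconstruct(coeffs, mean_shape, basis):
    """Shape as the mean plus a linear combination of basis vectors."""
    return mean_shape + coeffs @ basis


# Synthetic stand-in for stereo tracking data (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
num_frames, num_points, true_modes = 200, 19, 3
true_basis = rng.normal(size=(true_modes, 3 * num_points))
true_mean = rng.normal(size=3 * num_points)
frame_coeffs = rng.normal(size=(num_frames, true_modes))
shapes = true_mean + frame_coeffs @ true_basis

mean_shape, basis, singular_values = learn_shape_basis(shapes, num_modes=true_modes)
alpha = project(shapes[0], mean_shape, basis)
recon = reconstruct(alpha, mean_shape, basis)
max_error = np.abs(recon - shapes[0]).max()
```

Because the synthetic shapes are generated from exactly three deformation modes, three basis vectors reconstruct them up to floating-point error. With real tracking data the number of modes would instead be a trade-off between model complexity and approximation error.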
