Animated Heads: From 3D Motion Fields to Action Descriptions

We demonstrate a method for computing three-dimensional (3D) motion fields on a face. Twelve synchronized and calibrated cameras are positioned around a talking person and observe the head in motion. We represent the head as a deformable mesh, which is fitted in a global optimization step to silhouette-contour and multi-camera stereo data derived from all images. The non-rigid displacement of the mesh from frame to frame, the 3D motion field, is determined from the spatio-temporal derivatives in all the images. We integrate these cues over time, thus producing an animated representation of the talking head. Our ability to estimate 3D motion fields points to a new framework for the study of action. The 3D motion fields can serve as an intermediate representation, which can be analyzed using geometrical and statistical tools for the purpose of extracting representations of generic human actions.
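To make the idea of recovering a 3D motion field from spatio-temporal image derivatives concrete, the following is a minimal sketch for a single surface point. It is not the mesh-based formulation of this paper; it assumes a linearized brightness-constancy constraint in each camera and a pinhole projection model, and it solves for the 3D velocity by ordinary least squares. The function names (`projection_jacobian`, `estimate_motion`) and the inputs (3x4 projection matrices, per-camera image gradients and temporal derivatives sampled at the projection of the point) are hypothetical and chosen only for illustration.

```python
import numpy as np

def projection_jacobian(P, X):
    """2x3 Jacobian of the pinhole projection x ~ P [X; 1] w.r.t. the 3D point X."""
    a, b, c = P @ np.append(X, 1.0)  # homogeneous image coordinates
    # Image coordinates are u = a/c and v = b/c; apply the quotient rule per row.
    return np.vstack([
        (P[0, :3] * c - a * P[2, :3]) / c**2,
        (P[1, :3] * c - b * P[2, :3]) / c**2,
    ])

def estimate_motion(Ps, X, grads, I_ts):
    """Least-squares 3D velocity V at point X (illustrative assumption, not the paper's solver).

    Each camera i contributes one linear constraint
        grad_i^T J_i V + I_t_i = 0,
    where grad_i is the spatial image gradient, I_t_i the temporal derivative,
    and J_i the projection Jacobian. At least three independent constraints
    are needed to determine the three components of V.
    """
    A = np.array([g @ projection_jacobian(P, X) for P, g in zip(Ps, grads)])
    b = -np.asarray(I_ts, dtype=float)
    V, *_ = np.linalg.lstsq(A, b, rcond=None)
    return V
```

A single camera constrains only one component of the 3D velocity, which is why several calibrated views are combined. In practice the per-point estimate would also need visibility handling and some form of spatial regularization, both of which this sketch omits.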
