Recovering non-rigid 3D shape from image streams

The paper addresses the problem of recovering 3D non-rigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full face and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example, C. Tomasi and T. Kanades' (1992) factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three-step process to yield pose, configuration and shape. To the best of our knowledge, this is the first model free approach that can recover from single-view video sequences nonrigid shape models. We demonstrate this new algorithm on several video sequences. We were able to recover 3D non-rigid human face and animal models with high accuracy.

[1]  M. Turk,et al.  Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.

[2]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Takeo Kanade,et al.  A multi-body factorization method for motion analysis , 1995, Proceedings of IEEE International Conference on Computer Vision.

[4]  Timothy F. Cootes,et al.  Automatic interpretation of human faces and hand gestures using flexible models. , 1995 .

[5]  Michael Isard,et al.  Learning to Track the Visual Motion of Contours , 1995, Artif. Intell..

[6]  Andrew Blake,et al.  Separability of pose and expression in facial tracking and animation , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[7]  Dimitris N. Metaxas,et al.  Deformable model-based shape and motion analysis from images using motion residual error , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[8]  Henrique S. Malvar,et al.  Making Faces , 2019, Topoi.

[9]  David Salesin,et al.  Resynthesizing facial animation through 3D model-based tracking , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[10]  Matthew Turk,et al.  A Morphable Model For The Synthesis Of 3D Faces , 1999, SIGGRAPH.

[11]  David W. Jacobs,et al.  Linear Fitting with Missing Data for Structure-from-Motion , 2001, Comput. Vis. Image Underst..

[12]  Frédéric H. Pighin,et al.  Synthesizing realistic facial expressions from photographs , 1998, SIGGRAPH Courses.