Cooperative patch-based 3D surface tracking

This paper presents a novel dense motion capture technique that creates a temporally consistent mesh sequence from several calibrated and synchronised video sequences of a dynamic object. A surface patch model based on the topology of a user-specified reference mesh is employed to track the surface of the object over time. Multi-view 3D matching of surface patches, using a novel cooperative minimisation approach, provides initial motion estimates that are robust to large, rapid non-rigid changes of shape. A Laplacian deformation subsequently regularises the motion of the whole mesh, using the weighted vertex displacements as soft constraints. Unregistered surface geometry, reconstructed independently at each frame, is incorporated as a shape prior to improve the quality of tracking. The method is evaluated in a challenging scenario of facial performance capture. Results demonstrate accurate tracking of fast, complex expressions over long sequences without the use of markers or a pattern.
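The regularisation step can be illustrated with a minimal sketch of Laplacian deformation under soft positional constraints. It is not the paper's implementation: it assumes a uniform graph Laplacian rather than whatever discretisation the authors use, omits the shape prior, and solves a small dense least-squares problem. The names `uniform_laplacian` and `laplacian_deform` are hypothetical. The sketch minimises the change in Laplacian coordinates plus a weighted penalty on the distance of each vertex from its displaced target.

```python
import numpy as np


def uniform_laplacian(n_vertices, edges):
    """Uniform graph Laplacian (degree minus adjacency); an assumed discretisation."""
    L = np.zeros((n_vertices, n_vertices))
    for i, j in edges:
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L


def laplacian_deform(vertices, edges, displacements, weights):
    """Minimise ||L V' - L V||^2 + sum_i w_i ||v'_i - (v_i + d_i)||^2 over V'.

    vertices:      (n, 3) reference positions V
    edges:         iterable of (i, j) vertex index pairs
    displacements: (n, 3) per-vertex motion estimates d_i (e.g. from patch matching)
    weights:       (n,) non-negative confidences w_i; 0 means "no constraint"
    """
    n = len(vertices)
    L = uniform_laplacian(n, edges)
    delta = L @ vertices                      # Laplacian coordinates to preserve
    sqrt_w = np.sqrt(weights)[:, None]        # scale each soft-constraint row
    A = np.vstack([L, sqrt_w * np.eye(n)])    # smoothness rows, then constraint rows
    b = np.vstack([delta, sqrt_w * (vertices + displacements)])
    new_vertices, *_ = np.linalg.lstsq(A, b, rcond=None)
    return new_vertices


if __name__ == "__main__":
    # Four vertices on a line; vertex 2 carries an unreliable (zero-weight) estimate.
    V = np.array([[0.0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]])
    edges = [(0, 1), (1, 2), (2, 3)]
    d = np.zeros((4, 3))
    d[2] = [0.0, 5.0, 0.0]
    w = np.array([1.0, 1.0, 0.0, 1.0])
    print(laplacian_deform(V, edges, d, w))
```

Because the weights act as soft rather than hard constraints, a vertex with a low-confidence displacement is positioned mainly by its neighbours, which is how a regulariser of this kind can suppress outlying motion estimates. A real mesh would call for a sparse solver instead of the dense `lstsq` used here.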
