Markerless Shape and Motion Capture From Multiview Video Sequences

We propose a new markerless shape and motion capture approach for multiview video sequences. The shape recovery method consists of two steps: separating and merging. In the separating step, a depth map, represented as a point cloud, is generated for each view by solving a proposed variational model, which is regularized by four constraints to ensure the accuracy and completeness of the reconstruction. In the merging step, the point clouds of all views are merged and reconstructed into a 3-D mesh using a marching cubes method with silhouette constraints. Experiments show that geometric details are faithfully preserved in each estimated depth map. The 3-D meshes reconstructed from the estimated depth maps are watertight and present rich geometric details, even for non-convex objects. Taking the reconstructed 3-D mesh as the underlying scene representation, we propose a volumetric deformation method with a new positional-constraint computation scheme to automatically capture the motion of the 3-D object. Our method can capture non-rigid motion, even of loosely dressed humans, without the aid of markers.
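The hand-off between the two steps can be illustrated with a minimal sketch: each per-view depth map is lifted to a world-space point cloud, and the clouds are then concatenated. This is not the paper's variational model or its silhouette-constrained marching cubes. It only shows the standard pinhole back-projection that such a pipeline would typically rely on. The function names `backproject_depth` and `merge_views`, the camera convention `x_cam = R @ x_world + t`, and the use of non-positive depth to mark invalid pixels are all assumptions made for illustration.

```python
import numpy as np

def backproject_depth(depth, K, R, t):
    """Lift one depth map to world-space 3-D points (hypothetical helper).

    Assumptions: pinhole camera with intrinsics K, extrinsics such that
    x_cam = R @ x_world + t, depth measured along the camera z-axis,
    and depth <= 0 marking pixels without a valid estimate.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]              # pixel row (v) and column (u) indices
    valid = depth > 0
    # Homogeneous pixel coordinates of the valid pixels, shape 3 x N.
    pix = np.stack([u[valid], v[valid], np.ones(valid.sum())], axis=0)
    rays = np.linalg.inv(K) @ pix          # rays with unit z in camera space
    cam_pts = rays * depth[valid]          # scale each ray by its depth
    world = R.T @ (cam_pts - np.asarray(t, dtype=float).reshape(3, 1))
    return world.T                         # N x 3

def merge_views(views):
    """Concatenate the point clouds of all views into one N x 3 array.

    `views` is an iterable of (depth, K, R, t) tuples.
    """
    return np.concatenate([backproject_depth(*v) for v in views], axis=0)
```

The merged array would then be the input to a surface reconstruction stage; in the paper that stage is marching cubes with silhouette constraints, which this sketch does not attempt.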
