Dense Variational Reconstruction of Non-rigid Surfaces from Monocular Video

This paper offers the first variational approach to the problem of dense 3D reconstruction of non-rigid surfaces from a monocular video sequence. We formulate non-rigid structure from motion (NRSfM) as a global variational energy minimization problem that, given dense 2D correspondences, estimates a dense, low-rank, smooth 3D shape for every frame along with the camera motion matrices. Unlike traditional factorization-based approaches to NRSfM, which model the low-rank non-rigid shape using a fixed number of basis shapes and corresponding coefficients, we minimize the rank of the matrix of time-varying shapes directly via trace norm minimization. In conjunction with this low-rank constraint, we use an edge-preserving total variation regularization term to obtain spatially smooth shapes for every frame. Thanks to proximal splitting techniques, the optimization problem can be decomposed into many point-wise sub-problems and simple linear systems, which can be solved efficiently on GPU hardware. We show results on real sequences of different objects (face, torso, beating heart) where, despite challenges in tracking, illumination changes and occlusions, our method reconstructs highly deforming smooth surfaces densely and accurately, directly from video, without the need for any prior models or shape templates.
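To illustrate the trace norm term, the sketch below implements singular value soft-thresholding, the standard proximal operator of the trace (nuclear) norm, which is the kind of building block a proximal splitting scheme alternates with other sub-problems. It is a minimal NumPy sketch, not the paper's implementation: the function name `svt`, the F x 3N layout of the shape matrix, the threshold `tau`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def svt(S, tau):
    """Proximal operator of tau * ||S||_* (singular value soft-thresholding).

    Returns argmin_X 0.5 * ||X - S||_F^2 + tau * ||X||_*.
    """
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink singular values toward zero
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunk spectrum


# Hypothetical toy data (assumption): F frames, N points, each row holds one
# frame's 3D shape flattened to length 3N; the true shapes span K basis shapes.
rng = np.random.default_rng(0)
F, N, K = 20, 50, 2
low_rank = rng.standard_normal((F, K)) @ rng.standard_normal((K, 3 * N))
noisy = low_rank + 0.01 * rng.standard_normal((F, 3 * N))

# Thresholding removes the small singular values contributed by the noise.
denoised = svt(noisy, tau=1.0)
print(np.linalg.matrix_rank(noisy), np.linalg.matrix_rank(denoised, tol=1e-8))
```

Because the threshold acts on the singular values rather than on a preset number of basis shapes, the rank of the result is determined by the data and by `tau`, which is the practical difference from fixing the basis size in advance.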
