A Bayesian approach to simultaneously recover camera pose and non-rigid shape from monocular images

In this paper we bring the tools of the Simultaneous Localization and Map Building (SLAM) problem from a rigid to a deformable domain and use them to simultaneously recover the 3D shape of non-rigid surfaces and the sequence of poses of a moving camera. Under the assumption that the surface shape may be represented as a weighted sum of deformation modes, we show that the problem of estimating the modal weights along with the camera poses can be formulated probabilistically as a maximum a posteriori estimate and solved using an iterative least squares optimization. In addition, the probabilistic formulation we propose is very general and allows different constraints to be introduced without adding complexity. As a proof of concept, we show that local inextensibility constraints, which prevent the surface from stretching, can be easily integrated. An extensive evaluation on synthetic and real data demonstrates that our method has several advantages over current non-rigid shape from motion approaches. In particular, we show that our solution is robust to large amounts of noise and outliers, and that it needs neither to track points over the whole sequence nor to start from an initialization close to the ground truth.
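The modal representation and the least-squares form of the maximum a posteriori estimate can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several simplifying assumptions: the camera pose is held fixed (the paper estimates it jointly with the weights), the camera is a pinhole with a known focal length, the image noise is isotropic Gaussian, and the modal weights carry a zero-mean Gaussian prior. All function and parameter names (`shape_from_modes`, `map_weights`, `sigma_pix`, `sigma_w`, and so on) are hypothetical.

```python
import numpy as np


def shape_from_modes(mean_shape, modes, weights):
    """Return (N, 3) points: mean shape plus a weighted sum of deformation modes.

    mean_shape: (N, 3), modes: (K, N, 3), weights: (K,)
    """
    return mean_shape + np.tensordot(weights, modes, axes=1)


def project(points, focal=500.0):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) image points.

    Assumption: the points are already expressed in the camera frame,
    i.e. the pose is known and fixed in this sketch.
    """
    return focal * points[:, :2] / points[:, 2:3]


def residuals(weights, mean_shape, modes, observations, sigma_pix, sigma_w):
    """Stack whitened reprojection errors and the Gaussian prior on the weights.

    The squared norm of this vector is, up to a constant, twice the
    negative log-posterior under the assumptions stated above.
    """
    shape = shape_from_modes(mean_shape, modes, weights)
    reproj = (project(shape) - observations).ravel()
    return np.concatenate([reproj / sigma_pix, weights / sigma_w])


def map_weights(mean_shape, modes, observations,
                sigma_pix=1.0, sigma_w=10.0, iters=20, eps=1e-6):
    """Gauss-Newton iterations with a forward-difference Jacobian."""
    args = (mean_shape, modes, observations, sigma_pix, sigma_w)
    w = np.zeros(modes.shape[0])
    for _ in range(iters):
        r = residuals(w, *args)
        jac = np.empty((r.size, w.size))
        for k in range(w.size):
            dw = np.zeros_like(w)
            dw[k] = eps
            jac[:, k] = (residuals(w + dw, *args) - r) / eps
        # Solve the linearized least-squares problem jac @ step = -r.
        step = np.linalg.lstsq(jac, -r, rcond=None)[0]
        w = w + step
        if np.linalg.norm(step) < 1e-10:
            break
    return w
```

Under the same assumptions, further constraints such as local inextensibility could be appended as extra rows of the residual vector (for example, whitened differences between current and rest-state edge lengths), which leaves the solver unchanged.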
