Paper Information - Monocular Dense 3D Reconstruction of a Complex Dynamic Scene from Two Perspective Frames

This paper proposes a new approach for monocular dense 3D reconstruction of a complex dynamic scene from two perspective frames. By applying superpixel over-segmentation to the image, we model a generically dynamic (hence non-rigid) scene with a piecewise planar and rigid approximation. In this way, we reduce the dynamic reconstruction problem to a “3D jigsaw puzzle” problem, which takes pieces from an unorganized “soup of superpixels”. We show that our method provides an effective solution to the inherent relative scale ambiguity in structure-from-motion. Since our method does not assume a template prior, per-object segmentation, or knowledge about the rigidity of the dynamic scene, it is applicable to a wide range of scenarios. Extensive experiments on both synthetic and real monocular sequences demonstrate the superiority of our method over state-of-the-art methods.
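
To make the piecewise planar idea concrete, here is a minimal sketch, not the authors' method. It assumes each superpixel already has 3D points and a segment label, and fits one least-squares plane per segment. The function names `fit_plane` and `piecewise_planar` and the toy data are illustrative assumptions. The abstract gives no details of the actual pipeline, which would also have to recover per-superpixel rigid motion and resolve relative scale.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through an (N, 3) array of points.

    Returns (unit normal n, offset d) such that n . x = d on the plane.
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value
    # is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, float(normal @ centroid)

def piecewise_planar(points, labels):
    """Fit one plane per segment label; returns {label: (normal, d)}."""
    planes = {}
    for label in np.unique(labels):
        segment = points[labels == label]
        if len(segment) >= 3:  # a plane needs at least three points
            planes[int(label)] = fit_plane(segment)
    return planes

# Toy data (hypothetical): two planar patches, z = 1 and z = 2 + 0.5 * x.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(50, 2))
patch_a = np.column_stack([xy, np.ones(50)])
patch_b = np.column_stack([xy, 2.0 + 0.5 * xy[:, 0]])
points = np.vstack([patch_a, patch_b])
labels = np.repeat([0, 1], 50)

planes = piecewise_planar(points, labels)
```

In a real setting the labels would come from a superpixel over-segmentation of the image, and the per-segment planes would be the "pieces" that a later step has to assemble consistently.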
