Multi-view Scene Flow Estimation: A View Centered Variational Approach

We present a novel method for recovering the 3D structure and scene flow from calibrated multi-view sequences. We propose a 3D point cloud parametrization of the 3D structure and scene flow that allows us to directly estimate the desired unknowns. A unified global energy functional is proposed to incorporate the information from the available sequences and simultaneously recover both depth and scene flow. The functional enforces multi-view geometric consistency and imposes brightness constancy and piecewise smoothness assumptions directly on the 3D unknowns. It inherently handles the challenges of discontinuities, occlusions, and large displacements. The main contribution of this work is the fusion of a 3D representation and an advanced variational framework that directly uses the available multi-view information. This formulation allows us to advantageously bind the 3D unknowns in time and space. Different from optical flow and disparity, the proposed method results in a nonlinear mapping between the images’ coordinates, thus giving rise to additional challenges in the optimization process. Our experiments on real and synthetic data demonstrate that the proposed method successfully recovers the 3D structure and scene flow despite the complicated nonconvex optimization problem.

[1]  Rui Li,et al.  Multi-Scale 3D Scene Flow from Binocular Stereo Sequences , 2005, 2005 Seventh IEEE Workshops on Applications of Computer Vision (WACV/MOTION'05) - Volume 1.

[2]  Daniel Cremers,et al.  Stereoscopic Scene Flow Computation for 3D Motion Understanding , 2011, International Journal of Computer Vision.

[3]  Kiriakos N. Kutulakos,et al.  Multi-view scene capture by surfel sampling: from video streams to non-rigid 3D motion, shape and reflectance , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[4]  Andrew W. Fitzgibbon,et al.  Global stereo reconstruction under second order smoothness priors , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Richard Szeliski,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, International Journal of Computer Vision.

[6]  Thomas Brox,et al.  High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.

[7]  Olivier D. Faugeras,et al.  Multi-View Stereo Reconstruction and Scene Flow Estimation with a Global Image-Based Matching Score , 2007, International Journal of Computer Vision.

[8]  Rachid Deriche,et al.  Dense Depth Map Reconstruction: A Minimization and Regularization Approach which Preserves Discontinuities , 1996, ECCV.

[9]  Jean Ponce,et al.  Dense 3D motion capture from synchronized video streams , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Stefano Soatto,et al.  Occlusion Detection and Motion Estimation with Convex Optimization , 2010, NIPS.

[11]  Yael Moses,et al.  Multi-view scene flow estimation: A view centered variational approach , 2010, CVPR.

[12]  Ye Zhang,et al.  Integrated 3D scene flow and structure recovery from multiview image sequences , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[13]  Daniel Cremers,et al.  Efficient Dense Scene Flow from Sparse or Dense Stereo Data , 2008, ECCV.

[14]  Kwanghoon Sohn,et al.  Edge-preserving Simultaneous Joint Motion-Disparity Estimation , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[15]  Daniel Cremers,et al.  A Convex Formulation of Continuous Multi-label Problems , 2008, ECCV.

[16]  Michael Isard,et al.  Dense Motion and Disparity Estimation Via Loopy Belief Propagation , 2006, ACCV.

[17]  Luc Van Gool,et al.  Dense matching of multiple wide-baseline views , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[18]  Richard Szeliski,et al.  High-accuracy stereo depth maps using structured light , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[19]  Kiriakos N. Kutulakos,et al.  Multi-View Scene Capture by Surfel Sampling: From Video Streams to Non-Rigid 3D Motion, Shape and Reflectance , 2002, International Journal of Computer Vision.

[20]  Pedro F. Felzenszwalb,et al.  Efficient belief propagation for early vision , 2004, CVPR 2004.

[21]  Jean-Philippe Pons,et al.  Dense and Accurate Spatio-temporal Multi-view Stereovision , 2009, ACCV.

[22]  Nir A. Sochen,et al.  Variational Stereo Vision with Sharp Discontinuities and Occlusion Handling , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[23]  Takeo Kanade,et al.  Three-dimensional scene flow , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Ye Zhang,et al.  On 3D scene flow and structure estimation , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[25]  Y. Aloimonos,et al.  Spatio-Temporal Stereo Using Multi-Resolution Subdivision Surfaces , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[26]  Frederic Devernay,et al.  A Variational Method for Scene Flow Estimation from Stereo Sequences , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[27]  Takeo Kanade,et al.  Shape and motion carving in 6D , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[28]  D. Young Iterative methods for solving partial difference equations of elliptic type , 1954 .