Object scene flow

Abstract This work investigates the estimation of dense three-dimensional motion fields, commonly referred to as scene flow. Despite great progress in recent years, large displacements and adverse imaging conditions, as observed in natural outdoor environments, remain very challenging for current approaches to reconstruction and motion estimation. We propose a unified random field model that reasons jointly about 3D scene flow and the location, shape and motion of vehicles in the observed scene. We formulate the problem as the task of decomposing the scene into a small number of rigidly moving objects, each described by a single set of motion parameters. Our formulation thus introduces long-range spatial dependencies that commonly employed local rigidity priors lack. Our inference algorithm then estimates the association of image segments with object hypotheses, together with their three-dimensional shape and motion. We demonstrate the potential of the proposed approach by introducing a novel, challenging scene flow benchmark that allows for a thorough comparison of our approach against various baseline models. In contrast to previous benchmarks, our evaluation is the first to provide stereo and optical flow ground truth for dynamic real-world urban scenes at large scale. Our experiments reveal that rigid motion segmentation can serve as an effective regularizer for the scene flow problem, improving upon existing two-frame scene flow methods. At the same time, our method yields plausible object segmentations without requiring an explicitly trained recognition model for a specific object class.
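To make the central idea concrete, the following is a minimal sketch, not the model or inference procedure of the paper. It assumes each image segment is summarized by a single 3D point and carries a label naming one rigid object hypothesis; all names (`objects`, `segments`, `rigid_flow`, `scene_flow`) and all numeric values are hypothetical and chosen only for illustration. It shows how segments assigned to the same object share one rotation and translation, while their individual 3D flow vectors can still differ.

```python
import math


def rotation_about_y(angle_rad):
    """3x3 rotation matrix for a rotation about the vertical (y) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def rigid_flow(point, rotation, translation):
    """3D displacement of `point` under the rigid motion X' = R X + t."""
    moved = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
             for i in range(3)]
    return [moved[i] - point[i] for i in range(3)]


# Hypothetical object hypotheses: one (R, t) pair per rigidly moving object.
# The values are made up for illustration, not taken from the paper.
objects = {
    "background": (rotation_about_y(0.0), [0.0, 0.0, -1.0]),
    "vehicle": (rotation_about_y(0.1), [0.5, 0.0, 1.0]),
}

# Hypothetical segments: a representative 3D point plus an object label.
segments = [
    {"center": [2.0, 1.0, 10.0], "object": "background"},
    {"center": [-3.0, 1.5, 20.0], "object": "background"},
    {"center": [1.0, 0.5, 8.0], "object": "vehicle"},
    {"center": [1.5, 0.5, 9.0], "object": "vehicle"},
]


def scene_flow(segments, objects):
    """Flow per segment, induced by the motion of its assigned object."""
    return [rigid_flow(seg["center"], *objects[seg["object"]])
            for seg in segments]


flows = scene_flow(segments, objects)
for seg, flow in zip(segments, flows):
    print(seg["object"], [round(v, 3) for v in flow])
```

In this sketch the two background segments receive the same flow because their object moves by pure translation, whereas the two vehicle segments receive different flow vectors from the same shared motion parameters, since a rotation displaces points by different amounts depending on their position. The distance between the two vehicle points is preserved, which is what rigidity means here.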
