A Continuous Occlusion Model for Road Scene Understanding

We present a physically interpretable, continuous three-dimensional (3D) model for handling occlusions, with applications to road scene understanding. We probabilistically assign each point in space to an object and model the reflection and transmission probabilities for the corresponding camera ray. The model handles occlusions in a unified way across a variety of scenarios, such as associating structure from motion (SFM) point tracks with potentially occluding objects or modeling object detection scores in applications such as 3D localization. For point track association, our model treats static and dynamic objects uniformly, which is an advantage over the motion segmentation approaches traditionally used in multibody SFM. Detailed experiments on the KITTI raw dataset show that the proposed method outperforms both state-of-the-art motion segmentation and a baseline that heuristically uses detection bounding boxes to resolve occlusions. We also demonstrate how our continuous occlusion model may be applied to the task of 3D localization in road scenes.
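To make the idea of reflection and transmission probabilities along a camera ray concrete, the following is a minimal illustrative sketch and not the formulation used in the paper. It assumes each object is an axis-aligned ellipsoid with a soft (logistic) occupancy, and that a ray is attenuated exponentially by the occupancy it passes through. The names `occupancy` and `ray_association` and the parameters `sharpness` and `density` are hypothetical and chosen only for illustration.

```python
import math

def occupancy(point, center, radii, sharpness=4.0):
    """Soft occupancy in (0, 1): close to 1 inside the ellipsoid, close to 0 outside.

    Assumption: objects are axis-aligned ellipsoids with a logistic falloff.
    """
    d = math.sqrt(sum(((p - c) / r) ** 2 for p, c, r in zip(point, center, radii)))
    return 1.0 / (1.0 + math.exp(sharpness * (d - 1.0)))

def ray_association(direction, objects, max_depth=50.0, steps=500, density=5.0):
    """March along a ray from the camera origin.

    Returns (reflected, escaped): reflected[i] is the probability that the ray
    is stopped by object i, and escaped is the probability that it passes
    through every object up to max_depth.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    unit = [x / norm for x in direction]
    dt = max_depth / steps
    transmission = 1.0                      # probability of reaching the current depth
    reflected = [0.0] * len(objects)
    for k in range(steps):
        t = (k + 0.5) * dt                  # midpoint of the current depth interval
        point = [t * x for x in unit]
        for i, (center, radii) in enumerate(objects):
            # Assumed exponential attenuation over one step of length dt.
            p_stop = 1.0 - math.exp(-density * occupancy(point, center, radii) * dt)
            reflected[i] += transmission * p_stop
            transmission *= 1.0 - p_stop
    return reflected, transmission

# Two hypothetical cars on the optical axis, one behind the other.
cars = [((0.0, 0.0, 10.0), (1.0, 1.0, 2.0)),
        ((0.0, 0.0, 20.0), (1.0, 1.0, 2.0))]
reflected, escaped = ray_association((0.0, 0.0, 1.0), cars)
print(reflected, escaped)
```

Under these assumptions, a point track observed along the optical axis is assigned almost entirely to the nearer car, because very little of the ray is transmitted through it to reach the farther one. The same bookkeeping applies whether the objects are static or moving, since only their current positions enter the computation.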
