Mid-level Segmentation and Segment Tracking for Long-Range Stereo Analysis

This paper presents a novel way of combining dense stereo and motion analysis for mid-level scene segmentation and object tracking. The input is video data that calls for long-range stereo analysis, as is typical when recording traffic scenes from a mobile platform. The task is to identify the shapes of traffic-relevant objects, without attempting object classification at this stage. We solve this task by analysing disparity dynamics in the recorded scenes. Statistical shape models are generated over subsequent frames. Shape correspondences are established using a similarity measure based on set theory. The frame-to-frame motion of detected shapes is compensated using a dense motion field produced by a real-time optical flow algorithm. Experimental results show the quality of the proposed method, which is fairly simple to implement.
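
The correspondence step described above can be illustrated with a minimal sketch. The abstract does not name the set-theoretic similarity measure, so the Jaccard index (intersection over union) is used here as an assumed stand-in. Segments are represented as sets of pixel coordinates and the dense motion field as a per-pixel dictionary. The names `compensate_motion`, `set_similarity`, `match_segments`, and the `min_similarity` threshold are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]                    # (x, y) image coordinates
Flow = Dict[Pixel, Tuple[float, float]]    # per-pixel motion vector (u, v)


def compensate_motion(segment: Set[Pixel], flow: Flow) -> Set[Pixel]:
    """Shift each pixel of a segment by its flow vector, rounded to the pixel grid.

    Pixels without a flow entry are assumed to be static (an assumption of this sketch).
    """
    moved = set()
    for (x, y) in segment:
        u, v = flow.get((x, y), (0.0, 0.0))
        moved.add((x + round(u), y + round(v)))
    return moved


def set_similarity(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; assumed here, not confirmed by the paper."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def match_segments(
    prev_segments: List[Set[Pixel]],
    curr_segments: List[Set[Pixel]],
    flow: Flow,
    min_similarity: float = 0.5,  # hypothetical acceptance threshold
) -> Dict[int, int]:
    """Map each previous-frame segment index to its best current-frame match."""
    matches = {}
    for i, prev in enumerate(prev_segments):
        predicted = compensate_motion(prev, flow)
        best_j, best_score = None, min_similarity
        for j, curr in enumerate(curr_segments):
            score = set_similarity(predicted, curr)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches[i] = best_j
    return matches


# Toy example: a 4x4 segment that moves two pixels to the right between frames.
prev_seg = {(x, y) for x in range(0, 4) for y in range(0, 4)}
curr_seg = {(x, y) for x in range(2, 6) for y in range(0, 4)}
distractor = {(x, y) for x in range(20, 24) for y in range(20, 24)}
toy_flow = {p: (2.0, 0.0) for p in prev_seg}

raw_score = set_similarity(prev_seg, curr_seg)
compensated_score = set_similarity(compensate_motion(prev_seg, toy_flow), curr_seg)
toy_matches = match_segments([prev_seg], [distractor, curr_seg], toy_flow)
```

In this toy example the uncompensated overlap is only partial, while the motion-compensated segment coincides with its counterpart in the next frame. This is why compensating motion before comparing shapes makes the match more reliable.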
