Multimodal information fusion for urban scene understanding

This paper addresses scene understanding for driver assistance systems. Recognizing the wide variety of objects that may be found on the road requires several sensors and decision algorithms. The proposed approach represents all available information at the level of over-segmented image regions. The main novelty of the framework is its ability to incorporate new object classes and new sensors or detection methods while remaining robust to sensor failures. Several classes, such as ground, vegetation, and sky, are considered, along with three different sensors. The approach was evaluated on real, publicly available urban driving scene data.

[1]  Prakash P. Shenoy,et al.  On the plausibility transformation method for translating belief function models to probability models , 2006, Int. J. Approx. Reason..

[2]  Luc Van Gool,et al.  Dynamic 3D Scene Analysis from a Moving Vehicle , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Joachim Denzler,et al.  Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition , 2013, Machine Vision and Applications.

[4]  Thierry Denoeux,et al.  Analysis of evidence-theoretic decision rules for pattern classification , 1997, Pattern Recognit..

[5]  Thierry Denoeux,et al.  Classifier fusion in the Dempster-Shafer framework using optimized t-norm based combination rules , 2011, Int. J. Approx. Reason..

[6]  Huijing Zhao,et al.  Information Fusion on Oversegmented Images: An Application for Urban Scene Understanding , 2013, MVA.

[7]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[8]  Glenn Shafer,et al.  A Mathematical Theory of Evidence , 2020, A Mathematical Theory of Evidence.

[9]  W. F. Clocksin,et al.  Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction , 2011, International Journal of Computer Vision.

[10]  Hsuan-Tien Lin,et al.  A note on Platt’s probabilistic outputs for support vector machines , 2007, Machine Learning.

[11]  Alexei A. Efros,et al.  Recovering Surface Layout from an Image , 2007, International Journal of Computer Vision.

[12]  Fakhri Karray,et al.  Multisensor data fusion: A review of the state-of-the-art , 2013, Inf. Fusion.

[13]  Hugh F. Durrant-Whyte,et al.  Simultaneous Localization, Mapping and Moving Object Tracking , 2007, Int. J. Robotics Res..

[14]  Jiri Matas,et al.  On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Mayank Bansal,et al.  A real-time pedestrian detection system based on structure and appearance classification , 2010, 2010 IEEE International Conference on Robotics and Automation.

[16]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[17]  Luc Van Gool,et al.  Segmentation-Based Urban Traffic Scene Understanding , 2009, BMVC.

[18]  Wolfram Burgard,et al.  Probabilistic Robotics (Intelligent Robotics and Autonomous Agents) , 2005 .

[19]  Daniel P. Huttenlocher,et al.  Efficient Graph-Based Image Segmentation , 2004, International Journal of Computer Vision.

[20]  Ian Reid,et al.  gSLIC: a real-time implementation of SLIC superpixel segmentation , 2011 .

[21]  Philippe Smets,et al.  Belief functions: The disjunctive rule of combination and the generalized Bayesian theorem , 1993, Int. J. Approx. Reason..

[22]  Daniel Cremers,et al.  B-Spline Modeling of Road Surfaces With an Application to Free-Space Estimation , 2009, IEEE Transactions on Intelligent Transportation Systems.

[23]  W. F. Clocksin,et al.  Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction , 2012, International Journal of Computer Vision.

[24]  Alessandro Saffiotti,et al.  The Transferable Belief Model , 1991, ECSQARU.

[25]  Jeffrey A. Barnett,et al.  Calculating Dempster-Shafer Plausibility , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[26]  Thierry Denoeux,et al.  Evidential Grammars for Image Interpretation - Application to Multimodal Traffic Scene Understanding , 2013, IUKM.

[27]  Pietro Perona,et al.  Pedestrian Detection: An Evaluation of the State of the Art , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Andreas Geiger,et al.  Vision meets robotics: The KITTI dataset , 2013, Int. J. Robotics Res..

[29]  Didier Dubois,et al.  A definition of subjective possibility , 2008, Int. J. Approx. Reason..

[30]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Andreas Geiger,et al.  Efficient Large-Scale Stereo Matching , 2010, ACCV.

[32]  Bernt Schiele,et al.  A Dynamic Conditional Random Field Model for Joint Labeling of Object and Scene Classes , 2008, ECCV.

[33]  Horst Bischof,et al.  A Duality Based Approach for Realtime TV-L1 Optical Flow , 2007, DAGM-Symposium.

[34]  P. Walley Statistical Reasoning with Imprecise Probabilities , 1990 .

[35]  Véronique Berge-Cherfaoui,et al.  Multi-modal object detection and localization for high integrity driving assistance , 2014, Machine Vision and Applications.

[36]  Rudolf Mester,et al.  Free Space Computation Using Stochastic Occupancy Grids and Dynamic Programming , 2008 .