Unified situation modeling and understanding using hierarchical graphical model

Situation modeling and understanding is a challenging task because of scene complexity. In this work we address it with the extended Joint Directors of Laboratories (JDL) model for sensor data fusion. We propose an undirected graphical model that represents the global and local contextual dependencies (spatial, temporal, and semantic) between the objects (pedestrians, vehicles, and lanes) and the scene. We develop an inference algorithm that estimates the probability density function at each node of the graph in a bottom-up, top-down approach using a non-parametric belief propagation (NBP) scheme. The inferred objects are contextually consistent with respect to the other objects and the scene.
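The bottom-up, top-down inference described above can be illustrated with a deliberately simplified sketch. It is not the NBP scheme itself: NBP represents continuous densities non-parametrically, whereas this example runs ordinary discrete sum-product message passing on a two-level tree with one scene node and several object nodes. The state names, the potentials, and the function `two_pass_bp` are all hypothetical and chosen only for illustration.

```python
# Simplified stand-in for the bottom-up / top-down pass: discrete sum-product
# on a two-level tree (scene root, object leaves). This is NOT NBP; all names
# and numbers below are assumptions made for illustration.

def normalize(values):
    """Scale a list of non-negative weights so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def two_pass_bp(scene_prior, evidence, compat):
    """Return (scene_belief, object_beliefs) for a scene/object tree.

    scene_prior: list of weights over scene states.
    evidence:    dict name -> list of local weights over that object's states.
    compat:      dict name -> matrix, compat[name][s][x] is the compatibility
                 of scene state s with object state x.
    """
    n_scene = len(scene_prior)

    # Bottom-up pass: each object node sends a message to the scene node.
    up = {}
    for name, phi in evidence.items():
        up[name] = [
            sum(compat[name][s][x] * phi[x] for x in range(len(phi)))
            for s in range(n_scene)
        ]

    # Scene belief: prior times all incoming messages.
    scene_belief = list(scene_prior)
    for msg in up.values():
        scene_belief = [b * m for b, m in zip(scene_belief, msg)]
    scene_belief = normalize(scene_belief)

    # Top-down pass: the scene node sends each object a message that
    # excludes that object's own upward message.
    object_beliefs = {}
    for name, phi in evidence.items():
        cavity = list(scene_prior)
        for other, msg in up.items():
            if other != name:
                cavity = [c * m for c, m in zip(cavity, msg)]
        down = [
            sum(compat[name][s][x] * cavity[s] for s in range(n_scene))
            for x in range(len(phi))
        ]
        object_beliefs[name] = normalize([p * d for p, d in zip(phi, down)])

    return scene_belief, object_beliefs


# Hypothetical toy problem: scene states are (urban, highway),
# object states are (absent, present).
scene_prior = [0.5, 0.5]
evidence = {"pedestrian": [0.4, 0.6], "vehicle": [0.2, 0.8]}
compat = {
    "pedestrian": [[0.3, 0.7], [0.9, 0.1]],
    "vehicle": [[0.5, 0.5], [0.4, 0.6]],
}

scene_belief, object_beliefs = two_pass_bp(scene_prior, evidence, compat)
print(scene_belief)
print(object_beliefs)
```

Because the toy graph is a tree, the two passes give exact marginals; the beliefs at each object node then reflect both the local detector evidence and the scene context contributed by the other objects.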
