Integrating Recognition and Reconstruction for Cognitive Traffic Scene Analysis from a Moving Vehicle

This paper presents a practical system for vision-based traffic scene analysis from a moving vehicle, built on a cognitive feedback loop that integrates real-time geometry estimation with appearance-based object detection. We demonstrate how these two components can benefit from each other's continuous input and how the transferred knowledge can be used to improve scene analysis. Thus, scene interpretation is not left as a matter of logical reasoning, but is instead addressed by repeated interaction and consistency checks between different levels and modes of visual processing. As our results show, the proposed tight integration significantly increases recognition performance as well as overall system robustness. In addition, it enables novel capabilities such as the accurate 3D estimation of object locations and orientations and their temporal integration in a world coordinate frame. The system is evaluated on a challenging real-world car detection task in an urban scenario.
