Robust visual pedestrian detection by tight coupling to tracking

In this article, we propose a visual pedestrian detection system that couples pedestrian appearance and pedestrian motion in a Bayesian fashion, with the goal of making detection more invariant to appearance changes. More precisely, the system couples dense appearance-based pedestrian likelihoods, derived from a sliding-window SVM detector, to spatial prior distributions obtained from the prediction step of a particle-filter-based pedestrian tracker. This mechanism, which we term dynamic attention priors (DAP), is inspired by recent results on predictive visual attention in humans and can be implemented at negligible computational cost. We show experimentally, using a set of public, annotated pedestrian sequences, that detection performance improves significantly, especially in cases where pedestrians differ from the learned models, e.g., when they are too small, have an unusual pose, or appear in front of strongly structured backgrounds. In particular, dynamic attention priors allow the use of more restrictive detection thresholds without losing detections, while minimizing false detections.
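The coupling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the diffusion motion model, the Gaussian rendering of particles, the `floor` parameter, and all function names are assumptions made for illustration. The sketch predicts particle positions, renders them as a spatial prior map, and multiplies that prior with a dense detector likelihood map.

```python
import numpy as np


def predict_particles(particles, diffusion_std, rng):
    """Prediction step of a particle filter.

    Assumption: a simple random-walk motion model; the abstract does not
    specify the motion model actually used.
    particles: array of shape (N, 2) holding (x, y) image positions.
    """
    return particles + rng.normal(0.0, diffusion_std, size=particles.shape)


def prior_map_from_particles(particles, weights, shape, sigma=4.0, floor=0.2):
    """Render predicted particles as a spatial prior over image positions.

    Each particle contributes a weighted Gaussian blob. The map is scaled
    to [floor, 1] so that positions without tracker support are damped
    but never suppressed entirely (hypothetical design choice).
    """
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    prior = np.zeros(shape, dtype=float)
    for (px, py), w in zip(particles, weights):
        prior += w * np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
    peak = prior.max()
    if peak > 0:
        prior = prior / peak
    return floor + (1.0 - floor) * prior


def apply_attention_prior(likelihood, prior):
    """Bayesian-style coupling: unnormalized posterior = likelihood * prior."""
    return likelihood * prior
```

In such a scheme, a detector response that lies where the tracker predicts a pedestrian keeps most of its score, whereas an isolated response elsewhere is scaled down towards `floor` times its value, which is one way a stricter detection threshold could be applied without discarding tracked pedestrians.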
