Pedestrian detection using GPU-accelerated multiple cue computation

Accurate real-time pedestrian detection in practically relevant scenarios is important for many applications and remains a major scientific challenge. In this paper we present an algorithmic framework that efficiently computes pedestrian-specific shape and motion cues and combines them probabilistically to infer the location and occlusion status of pedestrians viewed by a stationary camera. The articulated pedestrian shape is represented by a set of sparse contour templates, and fast template matching against image features is carried out using integral images built along oriented scan-lines. The motion cue is obtained from a non-parametric background model in the YCbCr color space. Both cues are computed and evaluated on the GPU. Given the probabilistic output of the two cues, the spatial configuration of hypothesized human body locations is obtained by an iterative optimization scheme that takes into account the depth ordering and occlusion status of individual hypotheses. The method achieves fast computation times even in complex scenarios with high pedestrian density. The computational schemes employed are described in detail, and the validity of the approach is demonstrated on three PETS2009 datasets of increasing pedestrian density. Evaluation results and a comparison with the state of the art are presented.
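
The idea behind matching with integral images along scan-lines can be illustrated with a minimal CPU sketch. It is not the GPU implementation described in the paper: it assumes a binary edge map, handles only the horizontal scan-line orientation, and represents a contour template as a list of horizontal runs. All names (`scanline_integrals`, `segment_sum`, `template_score`) are hypothetical and chosen for illustration only.

```python
from itertools import accumulate


def scanline_integrals(edge_map):
    """Prefix sums along each horizontal scan-line, with a leading 0 per row."""
    return [[0] + list(accumulate(row)) for row in edge_map]


def segment_sum(integrals, row, start, end):
    """Sum of edge_map[row][start:end], obtained from two lookups."""
    return integrals[row][end] - integrals[row][start]


def template_score(integrals, segments, top, left):
    """Fraction of template pixels that coincide with edge pixels.

    segments: list of (row_offset, col_start, col_end) horizontal runs,
    an assumed, simplified stand-in for a sparse contour template.
    """
    hits = 0
    length = 0
    for row_offset, col_start, col_end in segments:
        hits += segment_sum(
            integrals, top + row_offset, left + col_start, left + col_end
        )
        length += col_end - col_start
    return hits / length


# Toy binary edge map (1 = edge pixel), assumed for illustration.
edge_map = [
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
integrals = scanline_integrals(edge_map)

# Template made of two horizontal runs of length 3, on rows 0 and 2.
template = [(0, 0, 3), (2, 0, 3)]
score = template_score(integrals, template, top=0, left=1)
```

Once the prefix sums are built, each run costs two lookups regardless of its length, which is what makes evaluating many templates at many positions cheap. Other orientations would need their own prefix sums accumulated along correspondingly oriented lines.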
