Efficient discriminative multiresolution cascade for real-time human detection applications

Human detection is fundamental in many machine vision applications, like video surveillance, driving assistance, action recognition and scene understanding. However in most of these applications real-time performance is necessary and this is not achieved yet by current detection methods. This paper presents a new method for human detection based on a multiresolution cascade of Histograms of Oriented Gradients (HOG) that can highly reduce the computational cost of detection search without affecting accuracy. The method consists of a cascade of sliding window detectors. Each detector is a linear Support Vector Machine (SVM) composed of HOG features at different resolutions, from coarse at the first level to fine at the last one. In contrast to previous methods, our approach uses a non-uniform stride of the sliding window that is defined by the feature resolution and allows the detection to be incrementally refined as going from coarse-to-fine resolution. In this way, the speed-up of the cascade is not only due to the fewer number of features computed at the first levels of the cascade, but also to the reduced number of windows that need to be evaluated at the coarse resolution. Experimental results show that our method reaches a detection rate comparable with the state-of-the-art of detectors based on HOG features, while at the same time the detection search is up to 23 times faster.

[1]  Jordi Gonzàlez,et al.  High-Speed Human Detection Using a Multiresolution Cascade of Histograms of Oriented Gradients , 2009, IbPRIA.

[2]  Ramakant Nevatia,et al.  Detection of multiple, partially occluded humans in a single image by Bayesian combination of edgelet part detectors , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[3]  Mei-Chen Yeh,et al.  Fast Human Detection Using a Cascade of Histograms of Oriented Gradients , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[4]  David A. McAllester,et al.  A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Wei Zhang,et al.  Real-time Accurate Object Detection using Multiple Resolutions , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[6]  Fatih Murat Porikli,et al.  Human Detection via Classification on Riemannian Manifolds , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8]  Tomaso A. Poggio,et al.  A Trainable System for Object Detection , 2000, International Journal of Computer Vision.

[9]  Sayan Mukherjee,et al.  Feature reduction and hierarchy of classifiers for fast object detection in video images , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[10]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[11]  Nazli Ikizler-Cinbis,et al.  Learning actions from the Web , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[12]  Sami Romdhani,et al.  Efficient Face Detection by a Cascaded Support Vector Machine Using Haar-Like Features , 2004, DAGM-Symposium.

[13]  Dariu Gavrila,et al.  Real-time object detection for "smart" vehicles , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[14]  Tomaso A. Poggio,et al.  Example-Based Object Detection in Images by Components , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[15]  Daniel P. Huttenlocher,et al.  Efficient matching of pictorial structures , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[16]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.