Hardware architecture for high-accuracy real-time pedestrian detection with CoHOG features

Co-occurrence histograms of oriented gradients (CoHOG) is a powerful feature descriptor for pedestrian detection. However, its computational cost is high because the CoHOG feature vector is very high-dimensional. In this paper, to achieve real-time detection on embedded systems, we propose a novel hardware architecture for CoHOG feature extraction. Our architecture exploits a high degree of fine-grained parallelism and combines an efficient histogram generator with a linear SVM classifier. The proposed architecture is implemented on a Xilinx Virtex-5 FPGA and achieves real-time pedestrian detection at 38 fps on 320×240 video. This is more than 100 times faster than execution on a state-of-the-art Intel CPU.
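
To illustrate why the CoHOG feature vector becomes so large, the following is a minimal software sketch of the general idea: quantize gradient orientations, count co-occurring orientation pairs for each pixel offset, and score the concatenated histograms with a linear SVM. It is not the proposed hardware architecture. The function names, the number of orientation bins, the offsets, the gradient operator, and the use of the whole image as a single region are all assumptions chosen for illustration.

```python
import math

def quantize_orientations(image, bins=8):
    """Map each pixel to an orientation bin using central-difference gradients.

    `bins=8` is an assumed setting. Pixels with zero gradient fall into bin 0
    here, which is a simplification made for this sketch.
    """
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbour indices at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            angle = math.atan2(gy, gx) % (2 * math.pi)
            labels[y][x] = int(angle / (2 * math.pi) * bins) % bins
    return labels

def cooccurrence(labels, offset, bins=8):
    """Count pairs (bin at pixel, bin at pixel + offset) in a bins*bins histogram."""
    dx, dy = offset
    h, w = len(labels), len(labels[0])
    hist = [0] * (bins * bins)
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                hist[labels[y][x] * bins + labels[y2][x2]] += 1
    return hist

def cohog_feature(image, offsets, bins=8):
    """Concatenate one co-occurrence histogram per offset (single region, assumed)."""
    labels = quantize_orientations(image, bins)
    feature = []
    for offset in offsets:
        feature.extend(cooccurrence(labels, offset, bins))
    return feature

def linear_svm_score(feature, weights, bias):
    """Linear SVM decision value: dot(weights, feature) + bias."""
    return sum(f * w for f, w in zip(feature, weights)) + bias

# Hypothetical 4x4 grayscale patch and two offsets, for illustration only.
image = [
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]
offsets = [(1, 0), (0, 1)]
feature = cohog_feature(image, offsets)
score = linear_svm_score(feature, [1.0] * len(feature), 0.5)
```

Each offset contributes `bins * bins` dimensions, so the feature length grows with the number of offsets and regions. Every pixel pair updates its histogram bin independently of the others, which is the kind of fine-grained parallelism a hardware implementation can exploit.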
