FPGA-based pedestrian detection under strong distortions

Pedestrian detection is one of the most popular computer vision challenges in the automotive, security and domotics industries, with several new approaches and benchmarks proposed every year. Nearly all of them assume that pedestrians appear in a standing pose, but this assumption does not always hold. This is the case for embedded camera systems used for crowd monitoring or for driver assistance when maneuvering large vehicles. Such systems are commonly mounted as high as possible and use fish-eye lenses to provide a wide, top-down field of view. These configurations introduce both perspective and optical distortions into the image; even after correction, the resulting silhouettes remain stretched and can hardly be detected by cutting-edge pedestrian detection algorithms. In this paper we focus on this scenario, showing (a) that one of the most effective models for pedestrian detection, the Deformable Part Model (DPM), can be efficiently implemented on an FPGA to dramatically speed up the computation, and (b) how it can be modified to deal with highly distorted images of humans. The resulting framework, dubbed Deformable Part Model for Local Spatial Deformations (DPM-LSD), achieves convincing figures of merit in terms of accuracy and throughput on a new top-view, fish-eye pedestrian dataset (dubbed Fish-Eyed Pedestrians), and is compared against widely used competitors (standard DPM and Dalal-Triggs).
