A Real-time Scalable Object Detection System using Low-power HOG Accelerator VLSI

This paper presents a real-time object detection system built around a Histogram of Oriented Gradients (HOG) feature extraction accelerator VLSI. The VLSI [1, 2] enables the system to achieve real-time performance and to scale to multiple object detection under a limited power budget. It employs three techniques: a VLSI-oriented HOG algorithm with early classification in the Support Vector Machine (SVM) stage, a dual-core architecture for parallel feature extraction, and a detection-window-size scalable architecture with a reconfigurable MAC array for processing objects of different shapes. A test chip was fabricated in 65 nm CMOS technology. Measurements show that the VLSI consumes 43 mW at 42.9 MHz and 1.1 V when processing HDTV video (1920 × 1080 pixels) at 30 frames per second (fps). A multiple object detection system and a multiple scale object detection system are presented to demonstrate the flexibility and scalability that the VLSI provides and its applicability to a wide range of object detection applications. The multiple object detection system achieves real-time detection on HDTV-resolution video with 84 mW of power consumption when detecting two types of targets, while maintaining detection accuracy comparable to that of a software-based system. The multiple scale object detection system detects five scales of a target using a single VLSI, with the power consumption of the VLSI estimated at 102 mW for this task.
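The abstract does not spell out the early classification criterion, so the following is only a minimal sketch of one common way to stop a linear SVM evaluation early, not the method implemented in the VLSI. It assumes that every feature value lies in [0, 1] (a hypothetical normalization), so that the positive weights of the unprocessed features bound the score that can still be gained. The function name `classify_with_early_exit`, the block size, and the zero decision threshold are all illustrative assumptions.

```python
import numpy as np


def classify_with_early_exit(features, weights, bias, block_size):
    """Return (is_object, blocks_evaluated) for a linear SVM.

    Assumption (not from the paper): every feature lies in [0, 1], so the
    largest score still reachable from the unprocessed features is the sum
    of their positive weights.
    """
    n = features.size
    # remaining_max[i] is an upper bound on the contribution of features[i:].
    positive = np.maximum(weights, 0.0)
    remaining_max = np.concatenate([np.cumsum(positive[::-1])[::-1], [0.0]])

    score = float(bias)
    blocks = 0
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        # Accumulate the partial dot product for this block of features.
        score += float(np.dot(weights[start:stop], features[start:stop]))
        blocks += 1
        # Even the best case for the remaining blocks cannot reach zero,
        # so the window can be rejected without reading further features.
        if score + remaining_max[stop] < 0.0:
            return False, blocks
    return score >= 0.0, blocks


# Hypothetical 8-element descriptor processed in two blocks of four.
w = np.array([-1.0, -1.0, -1.0, -1.0, 0.5, 0.5, 0.5, 0.5])
rejected = classify_with_early_exit(np.ones(8), w, 0.0, 4)
accepted = classify_with_early_exit(np.zeros(8), w, 0.125, 4)
```

Under the stated assumption this rejection rule never changes the final decision; it only skips work for windows that cannot be classified as positive. Since most detection windows in a frame contain background, skipping that work is a plausible source of cycle and power savings.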

[1] Hironobu Fujiyoshi, et al. FPGA Hardware with Target-Reconfigurable Object Detector, 2015, IEICE Trans. Inf. Syst.

[2] Shintaro Izumi, et al. A sub-100-milliwatt dual-core HOG accelerator VLSI for real-time multiple object detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[3] Ryusuke Miyamoto, et al. Hardware architecture for high-accuracy real-time pedestrian detection with CoHOG features, 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[4] Jack E. Volder. The CORDIC Trigonometric Computing Technique, 1959, IRE Trans. Electron. Comput.

[5] Luis Salgado, et al. Video analysis-based vehicle detection and tracking using an MCMC sampling framework, 2012, EURASIP J. Adv. Signal Process.

[6] Shintaro Izumi, et al. A Sub-100 mW Dual-Core HOG Accelerator VLSI for Parallel Feature Extraction Processing for HDTV Resolution Video, 2013, IEICE Trans. Electron.

[7] Guang Deng, et al. Real-Time Vision-Based Stop Sign Detection System on FPGA, 2008, 2008 Digital Image Computing: Techniques and Applications.

[8] Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9] Ramakant Nevatia, et al. Efficient scan-window based object detection using GPGPU, 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[10] Ulrich Brunsmann, et al. FPGA-GPU architecture for kernel SVM pedestrian detection, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops.

[11] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[12] Hiroyuki Ochi, et al. Hardware Architecture for HOG Feature Extraction, 2009, 2009 Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing.

[13] S. Bauer, et al. FPGA Implementation of a HOG-based Pedestrian Recognition System, 2010.

[14] Yuichiro Shibata, et al. Deep pipelined one-chip FPGA implementation of a real-time image-based human detection algorithm, 2011, 2011 International Conference on Field-Programmable Technology.

[15] Shintaro Izumi, et al. Architectural Study of HOG Feature Extraction Processor for Real-Time Object Detection, 2012, 2012 IEEE Workshop on Signal Processing Systems.