A 58.6mW real-time programmable object detector with multi-scale multi-object support using deformable parts model on 1920×1080 video at 30fps

This paper presents a programmable, energy-efficient and real-time object detection accelerator using deformable parts models (DPM), with 2× higher accuracy than traditional rigid body models. With 8 deformable parts detection, three methods are used to address the high computational complexity: classification pruning for 33× fewer parts classification, vector quantization for 15× memory size reduction, and feature basis projection for 2× reduction of the cost of each classification. The chip is implemented in 65nm CMOS technology, and can process HD (1920×1080) images at 30fps without any off-chip storage while consuming only 58.6mW (0.94nJ/pixel, 1168 GOPS/W). The chip has two classification engines to simultaneously detect two different classes of objects. With a tested high throughput of 60fps, the classification engines can be time multiplexed to detect even more than two object classes. It is energy scalable by changing the pruning factor or disabling the parts classification.

[1]  David A. Forsyth,et al.  30Hz Object Detection with DPM V5 , 2014, ECCV.

[2]  Shintaro Izumi,et al.  A Real-time Scalable Object Detection System using Low-power HOG Accelerator VLSI , 2014, J. Signal Process. Syst..

[3]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Vivienne Sze,et al.  Energy-efficient HOG-based object detection at 1080HD 60 fps with multi-scale support , 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).

[5]  Yutaka Yamada,et al.  18.2 A 1.9TOPS and 564GOPS/W heterogeneous multicore SoC with color-based object classification accelerator for image-recognition applications , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.