A Hardware-Oriented Algorithm for Ultra-High-Speed Object Detection

This paper describes a novel hardware-oriented algorithm, implementable on a field-programmable gate array (FPGA) in a high-speed vision platform, that detects multiple objects with clear texture information in $512\times 512$-pixel images at 10 000 frames per second (fps) against complex backgrounds. The algorithm is designed for devices with limited hardware resources and targets high-frame-rate, high-data-throughput, highly parallel processing of video streams with low latency. It is based on the conventional histogram of oriented gradients (HOG) descriptor and a support vector machine (SVM) classifier. To balance speed against accuracy, many hardware-based optimizations were implemented. The data throughput is nearly 29.30 Gbps, and the latency for feature extraction is 0.76 μs (61 clock periods). After hardware-based image processing, the source image and the detected object features can be transferred to a personal computer for recording or post-processing at 10 000 fps. Several experiments demonstrate the performance of the proposed algorithm for detecting ultra-high-speed moving objects with clear texture information in images.
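The throughput and latency figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is not taken from the paper. The pixel depth (12 bits), the binary interpretation of "G" ($2^{30}$), and the 80 MHz clock are assumptions chosen because they reproduce the quoted values. The abstract itself states only the image size, the frame rate, the 61-clock latency, and the two results.

```python
# Back-of-the-envelope check of the quoted throughput and latency figures.
# Values marked "assumption" are NOT stated in the abstract.

WIDTH, HEIGHT = 512, 512   # image size in pixels (from the abstract)
FPS = 10_000               # frame rate (from the abstract)
LATENCY_CYCLES = 61        # feature-extraction latency in clocks (from the abstract)

BITS_PER_PIXEL = 12        # assumption: pixel depth that matches 29.30 Gbps
BITS_PER_GIGA = 2**30      # assumption: "G" read as a binary prefix
CLOCK_HZ = 80e6            # assumption: clock rate that matches 0.76 us

# Raw pixel rate the pipeline has to sustain.
pixels_per_second = WIDTH * HEIGHT * FPS

# Data throughput in Gbps under the assumptions above.
throughput_gbps = pixels_per_second * BITS_PER_PIXEL / BITS_PER_GIGA

# Latency of 61 clock periods converted to microseconds.
latency_us = LATENCY_CYCLES / CLOCK_HZ * 1e6

print(f"pixel rate : {pixels_per_second:,} pixels/s")
print(f"throughput : {throughput_gbps:.2f} Gbps")
print(f"latency    : {latency_us:.2f} us")
```

Under these assumptions the script gives about 29.30 Gbps and 0.76 μs, consistent with the figures in the text. A different pixel depth or clock rate would change both results.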
