A Fast and Power-Efficient Hardware Architecture for Visual Feature Detection in Affine-SIFT

Visual feature detection is widely used in computer vision applications, where feature robustness, processing speed, and power efficiency are of increasing concern. Compared with other popular feature detection algorithms, affine-SIFT achieves the strongest robustness to image illumination, rotation, and scale transformations, but exhibits extremely high computational complexity. To improve its computing efficiency, this work first proposes three hardware optimization methods that address its three main performance bottlenecks. The first is reverse-affine-based pipelined computing with optimized memory access. The second is stream processing with a fully parallel Gaussian pyramid. The third is feature vector generation based on a rotation-invariant binary pattern. By incorporating these three methods, this paper then designs a highly efficient pipelined and parallel hardware architecture with optimized parallel memory access. Post-layout simulations using the TSMC 65-nm 1P9M low-power process show that this work achieves a processing speed of 97 fps at 1080p (1000 feature points per frame on average) at 200 MHz, with a power consumption of 300 mW. Its computing efficiency (1005.6K pixels/s at 1 MHz) and power efficiency (670.5K pixels/s at 1 mW) are higher than those of state-of-the-art works, making it more promising for a broad range of vision applications, especially embedded and mobile vision.
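The two efficiency figures quoted above can be reproduced from the reported frame rate, clock frequency, and power. The sketch below is a minimal check, assuming that 1080p means a 1920x1080 frame and that both metrics are simply pixel throughput divided by clock frequency (in MHz) or by power (in mW); the variable names are illustrative and are not taken from the paper.

```python
# Figures reported in the abstract.
WIDTH, HEIGHT = 1920, 1080   # assumption: "1080p" is a 1920x1080 frame
FPS = 97                     # reported processing speed
CLOCK_MHZ = 200              # reported clock frequency
POWER_MW = 300               # reported power consumption

# Pixel throughput in pixels per second.
pixels_per_s = WIDTH * HEIGHT * FPS

# Assumed metric definitions: throughput normalized to 1 MHz and to 1 mW.
computing_eff = pixels_per_s / CLOCK_MHZ   # pixels/s per MHz
power_eff = pixels_per_s / POWER_MW        # pixels/s per mW

print(f"throughput:           {pixels_per_s} pixels/s")
print(f"computing efficiency: {computing_eff / 1000:.3f}K pixels/s at 1 MHz")
print(f"power efficiency:     {power_eff / 1000:.3f}K pixels/s at 1 mW")
```

Under these assumptions the computing efficiency comes out at about 1005.7K pixels/s per MHz and the power efficiency at about 670.5K pixels/s per mW. The quoted 1005.6K matches if the value is truncated rather than rounded.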
