SURFEX: A 57fps 1080P resolution 220mW silicon implementation for simplified speeded-up robust feature with 65nm process

Speeded-Up Robust Features (SURF) is widely used in computer vision applications. In many recent applications, such as mobile devices and vision sensor networks, it is extremely difficult to meet both the performance and the power requirements with SURF implementations, especially CPU-, GPU-, DSP-, or FPGA-based ones. In this paper, the SURF algorithm is simplified and optimized for hardware implementation. To increase throughput, procedures such as orientation assignment and descriptor extraction are reorganized while maintaining sufficient accuracy; memory accesses are restructured to increase bandwidth and reduce repeated data accesses; and the workload of each pipeline stage is analyzed and balanced to reduce pipeline bubbles. Furthermore, a method called Word Length Reduction (WLR) is adopted to compress the integral image, which reduces the on-chip memory by 40% and significantly reduces the corresponding power consumption. The simplified SURF is implemented on a 3.4×4.0 mm² chip called SURFEX in a TSMC 65nm process. The chip processes 57 frames of 1080p (1920×1080) video per second at a 200MHz working frequency while dissipating 220mW. This throughput is 6 times that reported in the latest literature, and the power consumption is less than half that of the best previously reported implementations.
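To make the idea of shortening integral-image words concrete, the following is a minimal sketch of one common approach: every integral-image entry is stored modulo 2^bits, and box sums are recovered with wrap-around arithmetic. The abstract does not describe the exact WLR scheme used in SURFEX, so treating it as this modular variant is an assumption, and the function names (`required_bits`, `integral_image_mod`, `box_sum_mod`) are hypothetical. The recovered box sum is exact as long as the true sum fits in the chosen word length.

```python
def required_bits(max_pixel, box_w, box_h):
    """Smallest word length that can hold the largest possible box sum."""
    return (max_pixel * box_w * box_h).bit_length()


def integral_image_mod(img, bits):
    """Integral image with a zero border; every entry is kept modulo 2**bits.

    Assumption: this models a narrow on-chip memory word that simply
    wraps around on overflow.
    """
    mask = (1 << bits) - 1
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row, also wrapped
        for x in range(w):
            row_sum = (row_sum + img[y][x]) & mask
            ii[y + 1][x + 1] = (ii[y][x + 1] + row_sum) & mask
    return ii


def box_sum_mod(ii, top, left, box_h, box_w, bits):
    """Sum of pixels in the box, recovered from the wrapped integral image.

    Exact whenever the true box sum is below 2**bits.
    """
    mask = (1 << bits) - 1
    bottom, right = top + box_h, left + box_w
    return (ii[bottom][right] - ii[top][right]
            - ii[bottom][left] + ii[top][left]) & mask


# Small deterministic 8-bit test image (values are arbitrary).
img = [[(7 * x + 13 * y) % 256 for x in range(64)] for y in range(48)]
bits = required_bits(255, 9, 9)       # word length sized for 9x9 boxes
ii = integral_image_mod(img, bits)
wrapped = box_sum_mod(ii, 20, 30, 9, 9, bits)
direct = sum(img[y][x] for y in range(20, 29) for x in range(30, 39))
```

Under these assumptions, the word length is set by the largest box filter rather than by the frame size: `required_bits(255, 1920, 1080)` gives the width needed for a full-frame sum of 8-bit pixels, while `required_bits(255, 9, 9)` gives the much smaller width that suffices when no box exceeds 9×9 pixels.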
