Complexity Reduction by Modified Scale-Space Construction in SIFT Generation Optimized for a Mobile GPU

The scale-invariant feature transform (SIFT) is one of the most widely used local features for computer vision on mobile devices. A mobile graphics processing unit (GPU) is often used to run computer-vision applications that rely on SIFT features, but a mobile GPU alone is not fast enough to generate SIFT features in real time. This paper proposes an efficient scheme to optimize the SIFT algorithm for a mobile GPU. It analyzes the conventional scale-space construction step in SIFT generation and finds that reducing the size of the Gaussian filter and of the scale-space image yields a significant speedup with only a slight degradation in feature quality. Based on this observation, the SIFT algorithm is modified and implemented for real-time execution. Additional optimization techniques provide a further speedup by efficiently utilizing both the CPU and the GPU of a mobile processor. The proposed SIFT generation scheme achieves a processing speed of 28.30 frames/s for a $1280 \times 720$ image on a Galaxy S5 LTE-A device, a speedup by factors of 114.78 and 4.53 over CPU-only and GPU-only implementations, respectively.
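To illustrate why a smaller Gaussian filter and a smaller scale-space image reduce the cost of scale-space construction, the following minimal sketch counts multiply-accumulate operations for one separable Gaussian blur. It is a rough cost model, not the paper's implementation: the function names, the truncation multipliers (4.0 and 2.0), and the half-resolution image are hypothetical choices made only for illustration, and sigma = 1.6 is simply a commonly used SIFT base scale.

```python
import math


def gaussian_taps(sigma, truncate):
    """Taps in a 1-D Gaussian kernel cut off at `truncate` standard deviations.

    Both arguments are illustrative assumptions, not values from the paper.
    """
    radius = math.ceil(truncate * sigma)
    return 2 * radius + 1


def separable_blur_macs(width, height, taps):
    """Multiply-accumulates for one separable blur (horizontal + vertical pass)."""
    return 2 * width * height * taps


# Baseline: full 1280x720 image, kernel truncated at 4 sigma (assumed).
full = separable_blur_macs(1280, 720, gaussian_taps(1.6, 4.0))

# Reduced: half-resolution image, kernel truncated at 2 sigma (assumed).
reduced = separable_blur_macs(640, 360, gaussian_taps(1.6, 2.0))

print(f"baseline MACs: {full}")
print(f"reduced MACs:  {reduced}")
print(f"cost ratio:    {full / reduced:.2f}")
```

Under these assumptions the cost scales with the product of the pixel count and the kernel width, so shrinking both compounds the saving. The model ignores memory traffic and GPU occupancy, which can dominate on real mobile hardware.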
