A fast integral image generation algorithm on GPUs

An integral image, also known as a summed-area table, is a two-dimensional table generated from an input image. Each entry stores the sum of all input pixels above and to the left of that entry's position. The integral image is widely used in computer vision and computer graphics; in real-time computer vision in particular, it accelerates computing the sum over a rectangular area. Integral image generation is memory-bound. There are two typical existing integral image algorithms on GPUs. The first is the Scan-Scan algorithm. The second is the Scan-Transpose-Scan algorithm, which generates the integral image in three steps: the first and third steps are scans, and a transpose step is inserted between them so that the third step achieves coalesced global memory access. In this paper, we propose a novel blocked integral image algorithm with three stages: intra-block reduction, auxiliary matrix scan, and intra-block scan. Compared with the Scan-Scan algorithm, our scheme reduces global memory accesses while requiring fewer local synchronizations and incurring less load imbalance. Compared with the Scan-Transpose-Scan algorithm, our algorithm needs only about half of the global memory accesses while still achieving coalesced memory access. We implemented all three algorithms in OpenCL so that they run on both Nvidia and AMD GPUs. We also designed an auto-tuning framework that searches for optimal parameters for different input matrix sizes on those two platforms. The experimental results show that our proposed algorithm achieves the best performance of the three.
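To make the three stage names concrete, the following is a minimal sequential sketch in plain Python. It is not the authors' OpenCL implementation: the abstract only names the stages, so the decomposition below (per-tile row and column sums as the auxiliary matrices, and the names `integral_blocked`, `integral_reference`, `rect_sum`, and `block`) is one plausible reading, chosen for illustration. It assumes an inclusive summed-area table and an image whose height and width are exact multiples of the tile size.

```python
def integral_reference(img):
    """Inclusive summed-area table: out[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        row_acc = 0
        for x in range(w):
            row_acc += img[y][x]
            out[y][x] = row_acc + (out[y - 1][x] if y > 0 else 0)
    return out


def integral_blocked(img, block=2):
    """Three-stage tiled variant (illustrative assumption, not the paper's code)."""
    h, w = len(img), len(img[0])
    assert h % block == 0 and w % block == 0, "sketch assumes exact tiling"
    by, bx = h // block, w // block

    # Stage 1: intra-block reduction.
    # col_sums[I][x]: sum of column x over the rows of tile row I.
    # row_sums[y][J]: sum of row y over the columns of tile column J.
    col_sums = [[sum(img[I * block + i][x] for i in range(block))
                 for x in range(w)] for I in range(by)]
    row_sums = [[sum(img[y][J * block + j] for j in range(block))
                 for J in range(bx)] for y in range(h)]

    # Stage 2: scan the (much smaller) auxiliary matrices.
    # top[I][x]: sum of every pixel in tile rows above I with column <= x.
    top = [[0] * w for _ in range(by)]
    for I in range(1, by):
        acc = 0
        for x in range(w):
            acc += col_sums[I - 1][x]
            top[I][x] = top[I - 1][x] + acc
    # left[y][J]: sum of row y over every tile column to the left of J.
    left = [[0] * bx for _ in range(h)]
    for y in range(h):
        for J in range(1, bx):
            left[y][J] = left[y][J - 1] + row_sums[y][J - 1]

    # Stage 3: intra-block scan; each tile is independent of the others.
    out = [[0] * w for _ in range(h)]
    for I in range(by):
        for J in range(bx):
            prev_row = [0] * block   # local scan values of the previous row
            left_acc = 0             # contribution from tiles to the left
            for i in range(block):
                y = I * block + i
                left_acc += left[y][J]
                row_acc = 0
                for j in range(block):
                    x = J * block + j
                    row_acc += img[y][x]
                    prev_row[j] += row_acc
                    out[y][x] = prev_row[j] + top[I][x] + left_acc
    return out


def rect_sum(sat, y0, x0, y1, x1):
    """Sum of the inclusive rectangle (y0, x0)-(y1, x1) from four lookups."""
    total = sat[y1][x1]
    if y0 > 0:
        total -= sat[y0 - 1][x1]
    if x0 > 0:
        total -= sat[y1][x0 - 1]
    if y0 > 0 and x0 > 0:
        total += sat[y0 - 1][x0 - 1]
    return total
```

In this reading, stage 3 touches each input pixel once and writes each output once, and the tiles can be processed independently once the auxiliary scans are available, which is the property a GPU implementation would exploit. The `rect_sum` helper shows the rectangular-area query that the table is meant to accelerate.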
