Efficient optimization approach for fast GPU computation of Zernike moments

Abstract Our study focuses on accelerating the computation of Zernike moments on graphics processing units (GPUs). Two ideas achieve this goal. The first is a novel re-layout that reorders the image pixels and handles the diagonal pixels in advance, so that the computation for all pixels can be mapped effectively onto a single octant. The second is to leverage constant memory to store precomputed values that are used across GPU threads. An in-depth study evaluates the performance in each case, compares it against GPU implementations of other algorithms, and discusses the bottlenecks. The results show that our approach is effective and achieves a significant performance improvement over other state-of-the-art GPU implementations. Furthermore, our approach is well suited to distributing the data flow across multiple GPUs.
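
The octant idea rests on the eight-way symmetry of a square pixel grid about its centre: a pixel at offset (x, y) from the centre has the same radius as the pixels at (±x, ±y) and (±y, ±x), so a radial term evaluated once in one octant can be reused for the others, and pixels on the diagonal x = y yield only four distinct mirror images. The following is a minimal Python sketch of that symmetry, not the authors' re-layout or GPU kernel. The function names, the even image size `n`, and the choice of octant (x ≥ y > 0) are assumptions made for illustration.

```python
def symmetric_pixels(i, j, n):
    """Distinct pixels that share a radius with (i, j) under 8-way symmetry.

    Assumption: indices run 0..n-1 and the image centre lies at (n-1)/2,
    so mirroring an index k about the centre gives n-1-k.
    """
    mi, mj = n - 1 - i, n - 1 - j
    return {
        (i, j), (j, i), (mi, j), (j, mi),
        (i, mj), (mj, i), (mi, mj), (mj, mi),
    }


def first_octant(n):
    """Pixels with x >= y > 0 relative to the centre (n assumed even)."""
    half = n // 2
    return [(i, j) for j in range(half, n) for i in range(j, n)]


def radius_sq(i, j, n):
    """Squared distance from the centre, scaled by 4 to stay in integers."""
    return (2 * i - n + 1) ** 2 + (2 * j - n + 1) ** 2


if __name__ == "__main__":
    n = 8
    covered = set()
    for i, j in first_octant(n):
        group = symmetric_pixels(i, j, n)
        # Diagonal pixels (i == j) have 4 distinct mirror images, others 8.
        assert len(group) == (4 if i == j else 8)
        covered |= group
    # Every pixel of the n x n image is reached from the octant.
    assert len(covered) == n * n
```

In this sketch each group of mirrored pixels would need only one radial evaluation, and the diagonal case is separated out because its group is half the size, which is presumably why such pixels benefit from being addressed in advance.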
