Parallel implementation of wavelet-based image denoising on programmable PC-grade graphics hardware

The discrete wavelet transform (DWT) has been used extensively for image compression and denoising in image processing and computer vision. However, the DWT is computationally intensive because of its inherent multilevel data decomposition and reconstruction operations, and this cost becomes a bottleneck that severely limits its use in real-time applications on large digital images and/or high-definition video. Although various software-based acceleration solutions, such as the lifting scheme, have been devised and generally achieve higher performance, a purely software-accelerated DWT still struggles to meet the demands of real-time and interactive applications. With the growing capacity and popularity of graphics hardware, personal computers (PCs) are now often equipped with programmable graphics processing units (GPUs) for graphics acceleration. The GPU offers a cost-effective mechanism for parallel processing of large amounts of data, even for applications beyond graphics. This practice is commonly referred to as general-purpose computing on GPUs (GPGPU). This paper presents a GPGPU framework, with a corresponding parallel computing solution, for wavelet-based image denoising on off-the-shelf consumer-grade programmable GPUs. The framework can readily accommodate different forms of the DWT by customizing the parameters of the wavelet kernel. Experimental results show that the framework exploits data parallelism effectively and delivers satisfactory performance in accelerating wavelet-based denoising computations.
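
As a rough illustration of the decompose, threshold, reconstruct pipeline that wavelet-based denoising follows, here is a minimal CPU sketch in plain Python. It is not the paper's GPU framework: the Haar wavelet, the single decomposition level, the soft-thresholding rule, and every function name (`haar_2d`, `ihaar_2d`, `soft`, `denoise`) are assumptions chosen for brevity. Each output coefficient depends only on a small neighbourhood of inputs, which is the kind of data parallelism a GPU implementation could map to one thread or fragment per coefficient.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor (assumed kernel)


def haar_1d(v):
    """One-level 1D Haar analysis: approximations followed by details."""
    half = len(v) // 2
    approx = [(v[2 * i] + v[2 * i + 1]) * S for i in range(half)]
    detail = [(v[2 * i] - v[2 * i + 1]) * S for i in range(half)]
    return approx + detail


def ihaar_1d(c):
    """Inverse of haar_1d."""
    half = len(c) // 2
    out = []
    for i in range(half):
        a, d = c[i], c[half + i]
        out.append((a + d) * S)
        out.append((a - d) * S)
    return out


def transpose(m):
    return [list(r) for r in zip(*m)]


def haar_2d(img):
    """Separable 2D transform: rows first, then columns."""
    rows = [haar_1d(r) for r in img]
    return transpose([haar_1d(c) for c in transpose(rows)])


def ihaar_2d(coeffs):
    """Undo haar_2d: columns first, then rows."""
    cols = transpose([ihaar_1d(c) for c in transpose(coeffs)])
    return [ihaar_1d(r) for r in cols]


def soft(x, t):
    """Soft thresholding: shrink the magnitude of x toward zero by t."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def denoise(img, t):
    """Threshold the three detail sub-bands; keep the approximation band."""
    c = haar_2d(img)
    h, w = len(c) // 2, len(c[0]) // 2
    kept = [[v if (y < h and x < w) else soft(v, t)
             for x, v in enumerate(row)]
            for y, row in enumerate(c)]
    return ihaar_2d(kept)
```

The sketch assumes an image stored as a list of equal-length rows with even width and height. Calling `denoise(img, 0.0)` reproduces the input up to rounding, while a larger threshold suppresses small detail coefficients and smooths isolated deviations. Swapping in another wavelet would mean replacing the two 1D filter routines.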
