A GPU-Accelerated Wavelet Decompression System With SPIHT and Reed-Solomon Decoding for Satellite Images

The Set Partitioning in Hierarchical Trees (SPIHT) algorithm, which is based on the discrete wavelet transform (DWT), is widely used in image compression systems. The time-consuming computation of the 9/7 discrete wavelet decomposition is usually the bottleneck of these systems. To perform real-time Reed-Solomon channel decoding and SPIHT+DWT source decoding on a massive bit stream of compressed images continuously down-linked from a satellite, we propose a novel graphics processing unit (GPU)-accelerated decoding system. In this system, the GPU computes the time-consuming inverse DWT, while multiple CPU threads run in parallel to handle the remaining parts of the system. The CPU and GPU parts were carefully designed to have approximately the same processing speed, so that a novel pipeline structure for processing continuous satellite images obtains the maximum throughput. As part of the SPIHT decoding system, the GPU-based inverse DWT is about 158 times faster than its CPU counterpart. Through pipelined heterogeneous CPU and GPU computing, the entire decoding system approaches a speedup of 83x over its single-threaded CPU counterpart. The proposed channel and source decoding system is able to decompress 1024x1024 satellite images at a speed of 90 frames per second.
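To make the inverse 9/7 DWT step concrete, the following is a minimal one-dimensional sketch in plain Python. It assumes the lifting factorization of the 9/7 filter bank with the commonly published coefficients, an even-length input, and symmetric boundary extension. The abstract does not say whether the GPU kernel uses lifting or a filter bank, so the function names (`forward_97`, `inverse_97`) and every design choice here are illustrative assumptions rather than the authors' implementation.

```python
# Sketch only: 1D CDF 9/7 transform via lifting (assumed formulation).
# Standard published lifting coefficients; scaling conventions vary.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398


def _lift_odd(s, d, c):
    """Update odd samples: d[i] += c * (s[i] + s[i+1]), mirroring s at the right edge."""
    last = len(s) - 1
    return [d[i] + c * (s[i] + s[min(i + 1, last)]) for i in range(len(d))]


def _lift_even(s, d, c):
    """Update even samples: s[i] += c * (d[i-1] + d[i]), mirroring d at the left edge."""
    return [s[i] + c * (d[max(i - 1, 0)] + d[i]) for i in range(len(s))]


def forward_97(x):
    """Split an even-length signal into low-pass and high-pass subbands."""
    s, d = list(x[0::2]), list(x[1::2])
    d = _lift_odd(s, d, ALPHA)
    s = _lift_even(s, d, BETA)
    d = _lift_odd(s, d, GAMMA)
    s = _lift_even(s, d, DELTA)
    return [v * K for v in s], [v / K for v in d]


def inverse_97(low, high):
    """Undo forward_97: same lifting steps in reverse order with negated coefficients."""
    s = [v / K for v in low]
    d = [v * K for v in high]
    s = _lift_even(s, d, -DELTA)
    d = _lift_odd(s, d, -GAMMA)
    s = _lift_even(s, d, -BETA)
    d = _lift_odd(s, d, -ALPHA)
    x = [0.0] * (len(s) + len(d))
    x[0::2], x[1::2] = s, d  # interleave even and odd samples
    return x


if __name__ == "__main__":
    signal = [float((i * 37) % 11) for i in range(16)]
    low, high = forward_97(signal)
    restored = inverse_97(low, high)
    print(max(abs(a - b) for a, b in zip(signal, restored)))  # round-trip error
```

Within each lifting step, every output sample depends only on values produced by the previous step, so all samples of a step can be computed independently. That property is what makes this kind of transform a natural fit for one-thread-per-sample GPU execution; a two-dimensional inverse would apply the same one-dimensional step along columns and then rows at each decomposition level.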
