Three-level parallel-set partitioning in hierarchical trees coding based on the collaborative CPU and GPU for remote sensing images compression

Abstract. To accelerate the coding of massive remote sensing images (RSIs) in a ground service-oriented remote sensing system, this study proposes three-level parallel set partitioning in hierarchical trees (TP-SPIHT) coding on a collaborative central processing unit (CPU) and graphics processing unit (GPU). The three levels are the tree level, the bit-plane level, and the byte level, and the method parallelizes SPIHT by optimizing its dynamic, linked-list-based processing. A basic parallel SPIHT coder is presented first, consisting of preprocessing, tree-level parallel coding, and bit-stream organization. It uses three kinds of static marker matrices in place of the dynamic linked lists of the original SPIHT, which removes the data dependency of the original algorithm. In this basic coder, the bit-stream organization is implemented on the CPU, and the other processes are implemented on the GPU using GPU streams. The bit-stream organization is then further divided into a bit-plane-level parallel extraction of the bit-plane streams on a multicore CPU and a final bit-stream organization. Because no dependencies exist between the different byte operations in the final bit-stream organization, that step is accelerated by byte-level parallelization on the GPU. Experimental results with RSIs of different sizes show that TP-SPIHT takes 292.03 ms to code a 2048×2048 image, a 6.27× speedup over an optimized CPU implementation. The speedup ratio improves as the image size increases from 256×256 to 2048×2048.
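The byte-level step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pack_byte` and `pack_bits_parallel`, the MSB-first bit order, and the zero padding of the last byte are all assumptions made for illustration, and a CPU thread pool stands in for GPU threads. The point it shows is only that each output byte is computed from its own eight input bits, so the bytes can be produced in any order or all at once.

```python
from concurrent.futures import ThreadPoolExecutor


def pack_byte(bits, byte_index):
    """Pack the 8 bits belonging to one output byte (assumed MSB first)."""
    start = byte_index * 8
    value = 0
    for offset in range(8):
        pos = start + offset
        # Assumption: positions past the end of the stream are zero-padded.
        bit = bits[pos] if pos < len(bits) else 0
        value = (value << 1) | bit
    return value


def pack_bits_parallel(bits, workers=4):
    """Pack a list of 0/1 values into bytes, one independent task per byte.

    A thread pool is used here only to mirror the structure of a
    one-thread-per-byte GPU kernel; it is not expected to be fast in Python.
    """
    n_bytes = (len(bits) + 7) // 8
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so byte i lands at index i.
        return bytes(pool.map(lambda i: pack_byte(bits, i), range(n_bytes)))


if __name__ == "__main__":
    stream = [1, 0, 1, 0, 0, 0, 0, 1, 1, 1]
    print(pack_bits_parallel(stream).hex())
```

Because `pack_byte` reads only from the shared input and writes nothing shared, no synchronization between byte tasks is needed, which is the property the abstract relies on for byte-level parallelization.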
