A scalable embedded JPEG 2000 architecture

The latest image compression standard, JPEG 2000, is well tuned for diverse applications, which place widely varying throughput demands on its building blocks. A scalable JPEG 2000 encoder is therefore attractive, because it can be adapted to meet different throughput requirements. At the same time, the large volume of data streamed between blocks makes bandwidth optimization a central concern in the encoder design. The initial specification, especially its loop organization and array indexing, determines how data is manipulated and, in turn, influences the outcome of the architecture implementation. There is therefore a clear need for exploration support, and we believe the emphasis should lie on loop-level steering. In this paper, we apply loop transformation techniques to a scalable embedded JPEG 2000 encoder design during the architectural exploration stage, considering both the balance of throughput among the different blocks and the reduction of data transfer. The architecture is prototyped on a Xilinx FPGA.
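To make the point about loop organization concrete, the following is a minimal sketch of one loop transformation, loop fusion, applied to a one-dimensional reversible 5/3 lifting step of the kind used in the JPEG 2000 wavelet stage. It is an illustration only and not the transformation sequence used in the paper: the function names, the even-length restriction, and the mirrored boundary handling are assumptions made for this example. The two-pass version writes the whole detail array and then reads it back; the fused version carries the previous detail value in a local variable, so each detail coefficient is consumed as soon as it is produced.

```python
def lift53_two_pass(x):
    """Predict and update as two separate loops (hypothetical baseline)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("sketch assumes an even-length input of at least 2 samples")
    n = len(x) // 2
    d = [0] * n
    # Loop 1 (predict): detail coefficients from odd samples.
    for i in range(n):
        right = x[2 * i + 2] if 2 * i + 2 < len(x) else x[2 * i]  # mirror at the end
        d[i] = x[2 * i + 1] - ((x[2 * i] + right) >> 1)
    s = [0] * n
    # Loop 2 (update): re-reads d[i - 1] and d[i] from the array.
    for i in range(n):
        left = d[i - 1] if i > 0 else d[0]  # mirror at the start
        s[i] = x[2 * i] + ((left + d[i] + 2) >> 2)
    return s, d


def lift53_fused(x):
    """Same computation after fusing the two loops into one."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("sketch assumes an even-length input of at least 2 samples")
    n = len(x) // 2
    s, d = [0] * n, [0] * n
    prev_d = None  # holds d[i - 1] locally instead of re-reading the array
    for i in range(n):
        right = x[2 * i + 2] if 2 * i + 2 < len(x) else x[2 * i]
        cur_d = x[2 * i + 1] - ((x[2 * i] + right) >> 1)
        left = cur_d if prev_d is None else prev_d
        s[i] = x[2 * i] + ((left + cur_d + 2) >> 2)
        d[i] = cur_d
        prev_d = cur_d
    return s, d


if __name__ == "__main__":
    samples = [10, 12, 14, 13, 9, 7, 8, 20]
    print(lift53_two_pass(samples) == lift53_fused(samples))
```

Fusion is legal here because the update at index i depends only on detail values at i - 1 and i, both of which are already available inside the fused iteration. In a hardware setting the same reordering would, under these assumptions, replace an intermediate buffer read with a single register.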
