Quality and Power Efficient Architecture for the Discrete Cosine Transform

In recent years, the demand for battery-operated mobile multimedia devices has created a need for low-power implementations of video compression. Many compression standards rely on the discrete cosine transform (DCT) to perform image and video compression, so low-power DCT design has become increasingly important in image and video processing. This paper presents a new power-efficient Hybrid DCT architecture that combines the Loeffler DCT and the binDCT by exploiting the differing properties of the luminance and chrominance components. We use Synopsys PrimePower to estimate the power consumption in a TSMC 0.25-μm technology. We also adopt a novel quality assessment method based on structural distortion measurement, rather than peak signal-to-noise ratio (PSNR) and mean squared error (MSE), to measure quality. We conclude that our Hybrid DCT offers quality similar to that of the Loeffler DCT while saving 25% in power consumption and 27% in chip area.
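
For orientation, the quantities named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture of the paper: it computes an orthonormal N-point DCT-II directly from the textbook definition (neither the Loeffler factorization nor the binDCT), together with the conventional MSE and PSNR metrics that the structural distortion measure is meant to replace. The function names dct_1d, mse, and psnr, and the 8-bit peak value of 255, are assumptions made for this example.

```python
import math


def dct_1d(x):
    """Orthonormal N-point DCT-II evaluated directly from its definition.

    Reference computation only; fast algorithms such as Loeffler's
    produce the same coefficients with far fewer multiplications.
    """
    n = len(x)
    out = []
    for k in range(n):
        # Scale factors that make the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def mse(ref, test):
    """Mean squared error between two equal-length sample sequences."""
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit samples)."""
    err = mse(ref, test)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)
```

Because the transform above is orthonormal, the sum of squared coefficients equals the sum of squared input samples, which gives a quick sanity check when comparing an approximate DCT against this reference.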
