Low power and fast DCT architecture using multiplier-less method

This paper presents a low-power, fast DCT (Discrete Cosine Transform) that uses a multiplier-less method together with a new modified FGA (Flow-Graph Algorithm), derived from our previously presented FGA for the DCT based on the Loeffler algorithm. The multiplier-less method replaces each multiplication with a minimum number of additions and shifts. The proposed FGA is implemented and compared with the previous one. FPGA implementations on an Altera Cyclone II show that the new FGA increases the maximum frequency, reduces resource usage, and lowers dynamic power by 7.2% at a clock frequency of 120 MHz. A further comparison with recently published results confirms the efficiency of the proposed FGA.
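The idea of replacing a constant multiplication with additions and shifts can be illustrated with a minimal sketch. It assumes a canonical signed-digit (CSD) recoding of an integer constant, which is one common way to keep the number of add/subtract terms low; the abstract does not say which recoding the authors use, so this is not their FGA. The function names `csd_digits` and `shift_add_multiply`, and the 8-bit scaling of cos(pi/4) to 181, are hypothetical choices made for illustration only.

```python
def csd_digits(c):
    """Return the canonical signed-digit form of a non-negative integer.

    Digits are listed least-significant first and each is -1, 0, or +1.
    (Illustrative helper; not taken from the paper.)
    """
    digits = []
    while c != 0:
        if c & 1:
            # +1 when c mod 4 == 1, -1 when c mod 4 == 3; this choice
            # guarantees the next digit is zero (no adjacent non-zeros).
            d = 2 - (c & 3)
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


# Assumed fixed-point constant: round(cos(pi/4) * 2**8) = 181.
COS_PI_4_Q8 = 181

if __name__ == "__main__":
    sample = 100
    # The product is scaled by 2**8; shift right to return to sample scale.
    print(shift_add_multiply(sample, COS_PI_4_Q8) >> 8)
```

In hardware, each non-zero digit corresponds to one adder or subtractor and each shift is only wiring, which is why fewer non-zero digits generally means fewer resources and less switching activity.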
