Short-critical-path and structurally orthogonal scaled CORDIC-based approximations of the eight-point discrete cosine transform

A family of multiplierless transforms is presented that approximate the eight-point type-II discrete cosine transform (DCT) as accurately as the state-of-the-art scaled DCT schemes, but with 14-17% shorter critical paths (1/6 or 1/7 fewer adders). Compared to existing solutions that use the coordinate rotation digital computer (CORDIC) algorithm, the higher throughput comes together with a saving in additions. Only some lifting-based BinDCT schemes require fewer adders in total, and they have longer critical paths. The transforms have been derived from Loeffler's fast algorithm by replacing the rotation stage with unfolded CORDIC iterations, arranged so that two rotation approximations use the same scaling. This is equivalent to imposing structural orthogonality (losslessness) on the system, from which the scaling can then be extracted so as to shorten the critical path. Supporting ideas are a notation for describing CORDIC circuits more conveniently, and an angle conversion that allows rotations to be approximated using an extended set of CORDIC circuits. The results have been validated by field programmable gate array-based hardware design experiments and by usability tests based on a software JPEG codec.
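To make the shared-scaling idea concrete, the following is a minimal sketch of textbook unfolded CORDIC, not the circuits proposed in the paper. It assumes the classical greedy choice of rotation directions with every shift from 0 to 7 executed, and it uses 3π/16 and π/16 as example target angles. The function names (`cordic_rotate`, `cordic_angle`, `cordic_gain`, `greedy_steps`) are hypothetical and chosen only for illustration. Floating-point arithmetic stands in for the shift-and-add hardware.

```python
import math

def cordic_rotate(x, y, steps):
    """Apply unfolded CORDIC micro-rotations.

    steps is a list of (shift, sign) pairs; each pair costs two
    shift-and-add operations in hardware (modelled here with floats).
    """
    for shift, sign in steps:
        t = sign * 2.0 ** -shift
        x, y = x - t * y, y + t * x
    return x, y

def cordic_angle(steps):
    """Angle actually realised by the micro-rotations, in radians."""
    return sum(sign * math.atan(2.0 ** -shift) for shift, sign in steps)

def cordic_gain(steps):
    """Scaling introduced by the micro-rotations; depends on shifts only."""
    gain = 1.0
    for shift, _ in steps:
        gain *= math.sqrt(1.0 + 4.0 ** -shift)
    return gain

def greedy_steps(target, max_shift=7):
    """Classical CORDIC direction selection (an assumption of this sketch)."""
    steps, residual = [], target
    for shift in range(max_shift + 1):
        sign = 1 if residual >= 0 else -1
        steps.append((shift, sign))
        residual -= sign * math.atan(2.0 ** -shift)
    return steps

# Two example rotations built from the same shifts, differing only in signs.
steps_a = greedy_steps(3 * math.pi / 16)
steps_b = greedy_steps(math.pi / 16)

# Same shifts give the same gain, so one common scaling serves both.
print(cordic_gain(steps_a) == cordic_gain(steps_b))
print(abs(cordic_angle(steps_a) - 3 * math.pi / 16))
print(abs(cordic_angle(steps_b) - math.pi / 16))
```

Because the gain depends only on which shifts are used, two approximations that share a shift sequence are each a true rotation multiplied by one common factor. That factor can be pulled out of the rotation stage and merged into a later scaling step, which is the property the abstract calls structural orthogonality.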
