A memory efficient realization of cyclic convolution and its application to discrete cosine transform

This paper presents a memory-efficient design for realizing the cyclic convolution and its application to the discrete cosine transform (DCT). We adopt distributed arithmetic computation and exploit the symmetry of the DCT coefficients to merge elements in the DCT kernel matrix, then separate the kernel into two perfect cyclic forms. This enables an efficient realization of the 1-D N-point DCT using (N-1)/2 adders or subtractors, one small ROM module, a barrel shifter, and (N-1)/2+1 accumulators. Comparison with existing designs shows that the proposed design significantly reduces the delay-area product.
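As a rough illustration of why a cyclic form suits distributed arithmetic, the sketch below computes an N-point cyclic convolution with a single lookup table shared by all N outputs. It is not the architecture proposed in the paper. It assumes unsigned integer inputs of a fixed bit width, and the names `build_rom`, `cyclic_conv_da`, and `cyclic_conv_direct` are hypothetical. Each output reads the same ROM with a rotated address, which is the property that keeps the memory small.

```python
def build_rom(h):
    """Precompute every partial sum of the kernel h (2**N entries).

    Entry `addr` holds the sum of h[m] over all m whose bit is set in addr.
    """
    n = len(h)
    return [sum(h[m] for m in range(n) if (addr >> m) & 1)
            for addr in range(1 << n)]


def cyclic_conv_direct(h, x):
    """Reference: y[k] = sum_m h[m] * x[(k - m) mod N]."""
    n = len(h)
    return [sum(h[m] * x[(k - m) % n] for m in range(n)) for k in range(n)]


def cyclic_conv_da(h, x, bits):
    """Distributed-arithmetic cyclic convolution (illustrative sketch).

    Assumes every x[i] is an unsigned integer smaller than 2**bits.
    One ROM serves all outputs; only the address bit order rotates with k.
    """
    n = len(h)
    rom = build_rom(h)
    y = []
    for k in range(n):
        acc = 0
        for b in range(bits):  # one bit plane per step, LSB first
            addr = 0
            for m in range(n):
                # Bit m of the address is bit b of the input aligned with h[m].
                addr |= ((x[(k - m) % n] >> b) & 1) << m
            acc += rom[addr] << b  # shift-and-accumulate, no multiplier
        y.append(acc)
    return y


h = [3, -1, 4, 2, -5]   # example kernel (arbitrary values)
x = [7, 0, 12, 5, 9]    # example 4-bit unsigned inputs
print(cyclic_conv_da(h, x, bits=4) == cyclic_conv_direct(h, x))
```

In hardware terms, the inner loop over `m` corresponds to wiring input bits to ROM address lines, and the `<< b` weighting to a shifter feeding an accumulator. The sketch keeps a full 2**N-entry table for clarity and does not model the ROM size reductions described in the abstract.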
