Direct implementation of 2-D DCT on a low-cost linear-array architecture without intermediate transpose memory

A direct method for computing the 2-D DCT on a linear-array architecture is presented. The original 2-D DCT is first converted into a 1-D problem expressed as a matrix-vector product. We then propose a fast algorithm with low computational complexity and apply an efficient mapping technique to derive a hardware-efficient architecture from it. Unlike other 2-D DCT processors, which usually require transpose memory, the new architecture needs no intermediate transpose memory, is easily pipelined for high throughput, and scales readily to longer DCT lengths.
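The conversion of a 2-D DCT into a single matrix-vector product can be illustrated with a short sketch. The abstract does not give the paper's exact formulation, so the code below assumes the standard Kronecker-product identity: for an N x N input X and the orthonormal DCT-II matrix C, the row-column result Y = C X C^T equals (C ⊗ C) applied to the row-major vectorization of X. All function names (`dct_matrix`, `dct2_row_column`, `dct2_direct`, and the helpers) are hypothetical and chosen for illustration only.

```python
import math


def dct_matrix(n):
    """Return the orthonormal n x n DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2_row_column(x):
    """Conventional row-column 2-D DCT: Y = C X C^T.

    Hardware built this way typically buffers and transposes the
    intermediate product C X before the second 1-D pass.
    """
    c = dct_matrix(len(x))
    return matmul(matmul(c, x), transpose(c))


def kron(a, b):
    """Kronecker product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def dct2_direct(x):
    """2-D DCT as one matrix-vector product (assumed formulation).

    vec(Y) = (C kron C) vec(X), using row-major vectorization, so no
    intermediate transpose step appears anywhere in the computation.
    """
    n = len(x)
    c = dct_matrix(n)
    big = kron(c, c)                          # n^2 x n^2 transform matrix
    vec_x = [v for row in x for v in row]     # row-major vec(X)
    vec_y = [sum(big[r][s] * vec_x[s] for s in range(n * n))
             for r in range(n * n)]
    return [vec_y[r * n:(r + 1) * n] for r in range(n)]
```

The dense n^2 x n^2 matrix here only demonstrates that the two formulations agree; it costs far more arithmetic than the row-column form. A fast algorithm of the kind the abstract describes would presumably exploit the structure of that matrix rather than multiply by it directly.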
