A new fast DCT algorithm and its systolic VLSI implementation

The authors present a new fast algorithm, together with its systolic array implementation, for computing the N-point discrete cosine transform (DCT), where N is a power of two. The architecture requires $\log_2 N$ multipliers and can evaluate one complete N-point DCT (i.e., N transform samples) every N clock cycles. Because the design is regular and modular, it is well suited to VLSI implementation. Compared with existing systolic DCT designs of the same throughput, the proposed architecture has considerably lower hardware complexity.
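For orientation, the quantity being computed can be sketched as a direct evaluation of the N-point DCT. This is not the authors' fast algorithm or their systolic array; it is a plain O(N^2) reference under the assumption of the unnormalized DCT-II definition (the paper's scaling convention may differ). The names `dct_reference` and `multiplier_count` are hypothetical helpers introduced here for illustration only.

```python
import math


def dct_reference(x):
    """Direct O(N^2) evaluation of the unnormalized DCT-II (assumed convention).

    X[k] = sum_{i=0}^{N-1} x[i] * cos(pi * (2*i + 1) * k / (2*N))
    """
    n = len(x)
    # The abstract restricts N to powers of two; enforce that here.
    if n == 0 or n & (n - 1):
        raise ValueError("N must be a power of two")
    return [
        sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def multiplier_count(n):
    """Number of multipliers stated in the abstract: log2(N) for N a power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("N must be a power of two")
    return n.bit_length() - 1


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
    print(dct_reference(samples))
    print(multiplier_count(len(samples)))
```

A direct evaluation like this needs on the order of N^2 multiplications per transform, which is the cost that fast algorithms and dedicated hardware aim to reduce; it is useful mainly as a correctness check against a faster implementation.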
