High throughput pipelined 2D discrete cosine transform for video compression

This paper proposes an architecture and Verilog design of a fast, pipelined two-dimensional discrete cosine transform (2D DCT) with quantization on an FPGA, which can be used as a core in video compression hardware. The design relies on highly parallel, heavily pipelined circuits to increase throughput while remaining platform independent, whether the implementation targets an FPGA or an ASIC. The scheme incorporates dual-redundant input image memory, 45 stages of pipelining, and an optimized controller design, yielding a throughput of one coefficient per clock cycle at 100 MHz. A speed improvement of 30 percent has been achieved, and hardware resources are saved by reducing the number of arithmetic operators. The design is intended for implementation on a Xilinx Spartan 3E XC3S1500E FPGA.
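
As a software reference for the function such a core computes, the following minimal sketch applies a 2D DCT followed by quantization to one block. It is not the paper's pipelined architecture: the 8x8 block size, the orthonormal DCT-II scaling, the row-column (separable) decomposition, and the names `dct_matrix`, `dct2d`, and `quantize` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

N = 8  # assumed block size; the abstract does not state one


def dct_matrix(n=N):
    """Return the n x n orthonormal DCT-II basis matrix C."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct2d(block):
    """2D DCT as C * X * C^T (separable row-column form, an assumption)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def quantize(coeffs, qtable):
    """Divide each coefficient by its quantization step and round to an integer."""
    return [[int(round(coeffs[i][j] / qtable[i][j]))
             for j in range(len(coeffs[0]))]
            for i in range(len(coeffs))]


if __name__ == "__main__":
    flat = [[16] * N for _ in range(N)]    # constant test block
    qtable = [[16] * N for _ in range(N)]  # hypothetical uniform step size
    for row in quantize(dct2d(flat), qtable):
        print(row)
```

For a constant block, all of the energy lands in the DC coefficient, which makes this a convenient sanity check when comparing a hardware simulation against a software model.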
