High-level synthesis implementation of HEVC 2-D DCT/DST on FPGA

This paper presents the first known high-level synthesis (HLS) implementation of the integer discrete cosine transform (DCT) and discrete sine transform (DST) for High Efficiency Video Coding (HEVC). The proposed approach implements these 2-D transforms as two successive 1-D transforms using the well-known row-column and even-odd decomposition techniques. The proposed architecture is composed of a 4-point DCT/DST unit for the smallest transform blocks (TBs), an 8/16/32-point DCT unit for the other TBs, and a transpose memory for intermediate results. On an Arria II FPGA, the low-cost variant of the proposed architecture supports encoding of 1080p video at 60 fps at a cost of 10.0 kALUTs and 216 DSP blocks. The corresponding figures for the proposed high-speed variant are 2160p at 30 fps with 13.9 kALUTs and 344 DSP blocks. These cost-performance characteristics outperform those of comparable non-HLS approaches on FPGA.
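As an illustration of the two decomposition techniques named above, the following is a minimal software sketch for the 4-point DCT case only. It is not the paper's HLS source: the function names are hypothetical, the 4-point coefficient matrix is the one defined by the HEVC standard, and the rounding shifts that a real encoder applies between and after the two 1-D stages are omitted for brevity. The sketch computes the 2-D transform as two 1-D passes with a transpose in between, and checks the result against a direct matrix product.

```python
# HEVC 4-point core DCT matrix (integer coefficients from the standard).
DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def dct4_even_odd(x):
    """1-D 4-point DCT via even-odd decomposition (hypothetical helper)."""
    # Even part: sums of mirrored inputs feed the even-indexed outputs.
    e0, e1 = x[0] + x[3], x[1] + x[2]
    # Odd part: differences of mirrored inputs feed the odd-indexed outputs.
    o0, o1 = x[0] - x[3], x[1] - x[2]
    return [
        64 * e0 + 64 * e1,
        83 * o0 + 36 * o1,
        64 * e0 - 64 * e1,
        36 * o0 - 83 * o1,
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct4x4_row_column(block):
    """2-D DCT as two 1-D passes; no inter-stage scaling (assumption)."""
    stage1 = [dct4_even_odd(row) for row in block]              # X * C^T
    # The transpose here plays the role of the transpose memory.
    stage2 = [dct4_even_odd(row) for row in transpose(stage1)]  # (C * X * C^T)^T
    return transpose(stage2)                                    # C * X * C^T


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dct4x4_direct(block):
    """Reference result: plain matrix product C * X * C^T."""
    return matmul(matmul(DCT4, block), transpose(DCT4))


if __name__ == "__main__":
    residual = [[5, -3, 0, 2], [1, 4, -2, 0], [0, 0, 7, -1], [3, -6, 2, 2]]
    assert dct4x4_row_column(residual) == dct4x4_direct(residual)
```

In this sketch the even-odd form needs 6 multiplications per 4-point 1-D transform (4 if the two multiplications by 64 in the even part are shared or replaced by a shift), compared with 16 for the direct matrix-vector product. Larger transform sizes apply the same split recursively to the even part.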
