A high-performance and memory-efficient architecture for H.264/AVC motion estimation

Variable-block-size motion estimation (VBSME) is a major contributor to H.264/AVC's excellent coding efficiency. However, its high computational complexity and memory requirements make hardware design difficult. In this paper, we propose a memory-efficient hardware architecture for full-search VBSME (FSVBSME). Our architecture consists of sixteen 2-D arrays, each consisting of 16×16 processing elements (PEs). Four arrays form a group that matches four reference blocks against one current block in parallel. Four groups perform block matching for four current blocks in a consecutive and overlapped fashion. Taking advantage of the reference pixel overlap between multiple reference blocks of a current block, and between the search windows of several adjacent current blocks, we propose a novel data reuse scheme to reduce memory access. Compared with the popular Level C data reuse method, our design saves 98% of on-chip memory accesses with only 27% memory overhead. Synthesized with a TSMC 130 nm CMOS cell library, our design takes 453K logic gates and 2.94 Kbytes of on-chip memory. Running at 130 MHz, it is capable of processing 1920×1088 video at 30 fps with a 64×64 search range (SR) and two reference frames (RF). We also suggest a criterion called design efficiency for comparing related designs. It shows that our design is 27% more efficient than the best design to date.
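The throughput figures quoted above can be sanity-checked with a short back-of-the-envelope calculation. The sketch below is illustrative only and is not taken from the paper: it assumes that a 64×64 search range means 64 × 64 candidate positions per reference frame, and that each 16×16 PE array evaluates one candidate position per clock cycle once its pipeline is full. The function and constant names are hypothetical.

```python
# Figures quoted in the abstract.
WIDTH, HEIGHT = 1920, 1088      # frame size in pixels
FPS = 30                        # frames per second
CLOCK_HZ = 130_000_000          # 130 MHz
SEARCH_W, SEARCH_H = 64, 64     # search range (assumed: 64 x 64 candidates)
REF_FRAMES = 2                  # reference frames
NUM_ARRAYS = 16                 # sixteen 16x16 PE arrays
MB_SIZE = 16                    # macroblock edge length in pixels


def cycle_budget_per_mb():
    """Clock cycles available per macroblock at the stated clock and frame rate."""
    mbs_per_frame = (WIDTH // MB_SIZE) * (HEIGHT // MB_SIZE)
    mbs_per_second = mbs_per_frame * FPS
    return CLOCK_HZ / mbs_per_second


def cycles_needed_per_mb(candidates_per_array_per_cycle=1):
    """Cycles to visit every candidate position for one macroblock.

    ASSUMPTION (not stated in the abstract): each array finishes one
    candidate position per cycle, ignoring pipeline fill and stalls.
    """
    candidates = SEARCH_W * SEARCH_H * REF_FRAMES
    return candidates / (NUM_ARRAYS * candidates_per_array_per_cycle)


if __name__ == "__main__":
    print(f"cycle budget per macroblock: {cycle_budget_per_mb():.1f}")
    print(f"cycles needed per macroblock: {cycles_needed_per_mb():.1f}")
```

Under these assumptions, the search needs 512 cycles per macroblock against a budget of roughly 531, which is consistent with the claim that sixteen arrays at 130 MHz sustain 1920×1088 video at 30 fps. The small remaining margin would have to absorb any pipeline start-up and memory latency, which this sketch does not model.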
