Data Reuse Exploration for Low Power Motion Estimation Architecture Design in H.264 Encoder

Data access typically accounts for more than 50% of the power consumption in a modern signal processing system, so reducing memory access power is critical to a low-power design. Data reuse (DR) is a technique that recycles data already read from memory and can therefore reduce memory access power. In this paper, a systematic method of DR exploration for low-power architecture design is presented. First, the signal processing algorithms are formulated as nested loop structures, and data locality is explored through loop analysis. Then, the corresponding DR techniques are applied to reduce memory access power. The proposed design methodology is applied to the motion estimation (ME) algorithms of the H.264 video coding standard. After analyzing the ME algorithms, suitable parallel architectures and processing flows for integer ME (IME) and fractional ME (FME) are proposed to achieve efficient DR. The amount of memory access is reduced to 0.91% and 4.37% in the proposed IME and FME designs, respectively, which saves a large share of memory access power. The design methodology is also applicable to other signal processing systems with low-power requirements.
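As a rough illustration of the loop-analysis idea, the sketch below counts reference-pixel reads for full-search block matching written as a nested loop, with and without reuse of the search window. It is a minimal sketch under stated assumptions, not the paper's architecture or processing flow: the function names, the half-open search range [-search, search), and the simple "fetch the search window once" reuse scheme are all hypothetical, and the resulting ratio is not meant to reproduce the figures reported above.

```python
def ref_reads_no_reuse(block, search):
    """Count reference-pixel reads when every candidate re-fetches its block.

    Assumed model: candidates at offsets dy, dx in [-search, search),
    each compared against a block x block current block.
    """
    reads = 0
    for dy in range(-search, search):          # vertical candidate offset
        for dx in range(-search, search):      # horizontal candidate offset
            for y in range(block):             # rows of the candidate block
                for x in range(block):         # columns of the candidate block
                    reads += 1                 # one memory read per compared pixel
    return reads


def ref_reads_window_reuse(block, search):
    """Count reads when the search window is fetched once into a local buffer.

    Adjacent candidates overlap heavily, so only the distinct pixels
    covered by all candidates need to be read from memory.
    """
    side = 2 * search + block - 1              # window width covered by all candidates
    return side * side


def reuse_ratio(block, search):
    """Fraction of the no-reuse reads that remain after window-level reuse."""
    return ref_reads_window_reuse(block, search) / ref_reads_no_reuse(block, search)


if __name__ == "__main__":
    # Hypothetical parameters: 16x16 block, search range of 16 pixels.
    print(ref_reads_no_reuse(16, 16), ref_reads_window_reuse(16, 16))
    print(f"{reuse_ratio(16, 16):.4%}")
```

The two counters differ only in where the overlap between neighboring candidates is exploited, which is the kind of locality that loop analysis is meant to expose.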
