Maximizing memory data reuse for lower power motion estimation

This paper presents a new VLSI architecture for motion estimation in MPEG-2. A number of full-search block matching algorithms (BMA) and systolic-array architectures have previously been proposed for motion estimation. However, these architectures require an inefficiently large number of external memory accesses. More recently, to reduce the number of accesses within one search block, a block matching method that reuses search data within a search area has been implemented with systolic processing arrays. To further reduce data access and computation time during block matching, we propose a new approach that reuses previously searched data in two dimensions. The architecture extends our previous work: it reuses the previously searched area not only between two consecutive columns but also between two consecutive rows, so as to remove redundant memory accesses entirely. Experimental results show the efficiency of our algorithm.
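To make the reuse argument concrete, the following is a minimal counting sketch, not the architecture described in the paper. It assumes an N x N current block, a search range of ±p pixels (so each search area is (N + 2p) x (N + 2p)), steady-state operation away from frame boundaries, and unlimited on-chip buffering for reused data. The function name `reference_loads_per_block` and the four scheme labels are hypothetical, chosen only for illustration.

```python
def reference_loads_per_block(n: int, p: int) -> dict:
    """Idealized count of external reference-pixel loads per current block.

    n: block side length in pixels (assumed square block).
    p: search range in pixels, i.e. displacements in [-p, +p] on each axis.
    Frame-boundary effects and buffer-size limits are ignored (assumption).
    """
    candidates = (2 * p + 1) ** 2   # candidate positions in a full search
    side = n + 2 * p                # side length of one search area

    return {
        # Every candidate block is fetched from external memory again.
        "no_reuse": candidates * n * n,
        # Each pixel of the search area is fetched once per current block.
        "within_search_area": side * side,
        # Horizontally adjacent search areas overlap by 2p columns,
        # so only n new columns of height `side` are fetched.
        "column_reuse": n * side,
        # Overlap is also reused between block rows, so in steady state
        # only an n x n patch of new pixels is fetched per block.
        "two_dimensional_reuse": n * n,
    }


if __name__ == "__main__":
    for scheme, loads in reference_loads_per_block(16, 8).items():
        print(f"{scheme:>22}: {loads}")
```

Under these assumptions, two-dimensional reuse means each reference pixel is fetched from external memory only once per frame, which is the sense in which redundant accesses are removed. The on-chip storage needed to hold the overlapping rows is not modeled here.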
