Performance-optimized FPGA implementation for the flexible triangle search block-based motion estimation algorithm

This paper presents a performance-optimized FPGA implementation of the flexible triangle search (FTS) block-matching algorithm. The FTS is a fast block-matching algorithm for motion estimation proposed in previous work. Given a block of pixels, it searches for the best-matching block in a given search area using only a selected subset of the available positions, rather than evaluating every position as the full search algorithm does, which is computationally very expensive. Further analysis of the previous FPGA implementation of the FTS indicates that additional parallelism can be employed to reduce the overall processing time of the algorithm. In addition, investigating the performance bottlenecks and redesigning some of the hardware modules can increase the maximum supported frequency of the entire FTS FPGA implementation. The proposed design changes were implemented in VHDL and synthesized for a Xilinx Virtex-5 device. Simulation results indicate that the proposed implementation reduces the average number of cycles required to process a block by 17%. Moreover, synthesis results indicate that the proposed design increases the maximum supported frequency by around 38% compared to the previous FPGA implementation of the FTS algorithm. Consequently, the maximum supported frame rate is increased by around 66%.
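The reported frame-rate figure can be cross-checked from the other two numbers. The sketch below is not taken from the paper: it assumes that the achievable frame rate is proportional to the clock frequency divided by the average number of cycles per block, with the number of blocks per frame unchanged. The function name `frame_rate_gain` is hypothetical, and the inputs are the rounded percentages quoted in the abstract.

```python
def frame_rate_gain(cycle_reduction: float, frequency_gain: float) -> float:
    """Relative frame-rate gain under the assumption (not stated in the
    abstract) that frame rate scales as f_max / cycles_per_block."""
    cycles_ratio = 1.0 - cycle_reduction   # new cycles / old cycles
    freq_ratio = 1.0 + frequency_gain      # new f_max / old f_max
    return freq_ratio / cycles_ratio - 1.0


# Rounded figures from the abstract: 17% fewer cycles, about 38% higher f_max.
gain = frame_rate_gain(0.17, 0.38)
print(f"{gain:.1%}")  # prints 66.3%
```

Under this assumption the two improvements compound multiplicatively (1.38 / 0.83 ≈ 1.66), which is consistent with the stated increase of around 66%.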
