High parallel disparity map computing on FPGA

In this paper we present a method for disparity map computation and its corresponding highly parallel hardware accelerator. Our solution uses a two-step processing algorithm. First, we compute a one-dimensional biased sum of absolute differences; then a spurious-removal technique eliminates wrong estimations. The hardware accelerator introduces a memory organization, an address generation scheme, and data-path units that scale across instantiations with different resolutions, frame rates, silicon usage, and power consumption. We have implemented a five-stage pipelined organization that operates at 174.5 MHz on a Virtex-II Pro 2vp30fg676-7 FPGA device, performs the equivalent of 9.074 GOPS, and processes 142 frames per second in Common Intermediate Format (CIF).
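The two-step flow described above can be illustrated with a minimal software sketch. It is not the accelerator's actual design: the abstract does not define the bias term or the spurious-removal technique, so the linear per-disparity penalty (`bias`) and the one-dimensional median filter used here are assumptions chosen only for illustration, as are all function and parameter names.

```python
def sad_1d(left_row, right_row, x, d, half_window):
    """Sum of absolute differences over a 1-D window centred at x,
    comparing left pixel xl with right pixel xl - d."""
    total = 0
    for k in range(-half_window, half_window + 1):
        xl = x + k
        total += abs(left_row[xl] - right_row[xl - d])
    return total


def disparity_row(left_row, right_row, max_disparity, half_window=2, bias=0):
    """Step 1: pick, per pixel, the disparity with the lowest biased SAD.

    `bias` is a hypothetical penalty added per unit of disparity; the
    paper's actual bias is not given in the abstract.
    """
    width = len(left_row)
    out = [0] * width
    for x in range(half_window, width - half_window):
        best_d, best_cost = 0, None
        for d in range(max_disparity + 1):
            if x - half_window - d < 0:
                break  # window would fall off the left edge of the right row
            cost = sad_1d(left_row, right_row, x, d, half_window) + bias * d
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        out[x] = best_d
    return out


def remove_spurious(disp_row, radius=1):
    """Step 2 (assumed stand-in): 1-D median filter to suppress isolated outliers."""
    out = list(disp_row)
    for x in range(radius, len(disp_row) - radius):
        window = sorted(disp_row[x - radius:x + radius + 1])
        out[x] = window[len(window) // 2]
    return out


if __name__ == "__main__":
    # Synthetic scanline pair: the left row is the right row shifted by 3 pixels.
    right = [(i * 37) % 101 for i in range(32)]
    left = [0, 0, 0] + right[:-3]
    raw = disparity_row(left, right, max_disparity=8)
    print(remove_spurious(raw))
```

Each output pixel's search is independent of the others, which is the property a parallel hardware data path can exploit; the sequential loops here only make the arithmetic explicit.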
