Optimization Strategies for High-Performance Computing of Optical-Flow in General-Purpose Processors

In this paper, we describe a high-performance implementation of an optical-flow algorithm that exploits the processor's architecture. Tuning code, i.e., adapting it to take full advantage of the processor, is challenging and time-consuming and requires efficient programming at several levels, but it can yield significant performance improvements. The optimized implementation presented here is relevant to many applications because it delivers real-time motion estimation at high image resolution on a PC or on an embedded system based on a general-purpose processor. On a 2.83 GHz Core 2 Quad PC, it achieves a speedup of 14 over our first code version and 2052.7 f/s for the well-known 252 × 316 Yosemite sequence, and a speedup of 17.6 and 68.5 f/s for a 1016 × 1280 sequence. The description of how this performance is achieved goes beyond a specific application, however: it illustrates how inherently dense, low-level visual algorithms (pixel-wise computation) can be structured and improved to take full advantage of a standard processor. The implementation is compared with other hardware (FPGA- and GPU-based) and software (cluster-, PC-, and special-purpose-processor-based) optical-flow implementations and is shown to outperform them.
