A FPGA Implementation of Farneback Optical Flow by High-Level Synthesis

Optical flow algorithms, which estimate motion between consecutive video frames, are widely used in surveillance systems, Advanced Driver Assistance Systems (ADAS), and object movement estimation in scene analysis. Among the different optical flow algorithms, Farneback's method provides better accuracy and displacement estimates that are more robust to brightness changes, because it estimates the flow in the polynomial domain rather than from intensity maps. However, its high computational complexity and inconsistent data access patterns make it difficult to implement on a hardware platform. In this work, we present a micro-architecture design of Farneback optical flow that is flexible to optimize with high-level synthesis (HLS) tools. The original software-based implementation was decomposed into functional blocks to balance the latency of the different stages, and the data flows were rearranged to give better memory access patterns. The data flow arrangement is based on a proposed backtrace mechanism, in which DRAM accesses to the polynomial coefficients of the current frame follow a consistent traffic pattern, which makes it possible to integrate more functional blocks into a deeper pipeline. Across several versions of the micro-architecture design, we demonstrate fixed-point and floating-point options, as well as optimization techniques such as multiple DMAs and different levels of pipeline integration. We implemented our design on a Zedboard Mini-ITX 7045. The results show a 17x end-to-end speedup over a naive HLS version for an image size of 160x120. Considering only the hardware-accelerated part, our FPGA implementation is 40x faster than the naive HLS version while using only 50% of the FPGA hardware resources.
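
To illustrate what estimating flow "from the polynomial domain" means, the sketch below shows the single-pixel displacement solve from Farnebäck's two-frame polynomial-expansion formulation: each frame is locally modeled as f(x) ≈ xᵀAx + bᵀx + c, and the displacement d satisfies A d = Δb with A = (A1 + A2)/2 and Δb = -(b2 - b1)/2. This is a minimal floating-point sketch and not the HLS design described in this work; the type and function names (`Poly2`, `Flow`, `estimate_displacement`) are hypothetical, and the neighborhood averaging, iteration, and multi-scale steps of the full algorithm are omitted.

```cpp
#include <cmath>

// Hypothetical types for illustration only (not from the paper).
// Local quadratic model of one pixel: f(x) ~ x^T A x + b^T x + c.
struct Poly2 {
    float a11, a12, a22;  // entries of the symmetric 2x2 matrix A
    float b1, b2;         // entries of the linear term b
};

struct Flow {
    float dx, dy;
};

// Solve A d = delta_b for one pixel, where
//   A       = (A1 + A2) / 2
//   delta_b = -(b2 - b1) / 2
// Returns false when A is (near-)singular, so no reliable estimate exists.
bool estimate_displacement(const Poly2& p1, const Poly2& p2, Flow& out,
                           float eps = 1e-6f) {
    const float a11 = 0.5f * (p1.a11 + p2.a11);
    const float a12 = 0.5f * (p1.a12 + p2.a12);
    const float a22 = 0.5f * (p1.a22 + p2.a22);

    const float db1 = -0.5f * (p2.b1 - p1.b1);
    const float db2 = -0.5f * (p2.b2 - p1.b2);

    const float det = a11 * a22 - a12 * a12;
    if (std::fabs(det) < eps) {
        return false;
    }

    // Closed-form inverse of a symmetric 2x2 matrix applied to delta_b.
    out.dx = (a22 * db1 - a12 * db2) / det;
    out.dy = (a11 * db2 - a12 * db1) / det;
    return true;
}
```

Because the solve only involves the polynomial coefficients, a constant brightness offset between frames (which changes only c) does not affect the result. In a fixed-point variant, the division by `det` would typically be the step that needs the most care with bit widths.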
