An Efficiency-Driven Approach For Real-Time Optical Flow Processing On Parallel Hardware

This article tackles the entire lifecycle of an algorithm, from its design to its implementation. It presents a method for making efficient choices at algorithm design time given the characteristics of the target hardware. Computing the optical flow of a stream of images remains a demanding task, while the use of Graphics Processing Units (GPUs) has become mainstream and allows substantial gains in processing frame rate. In this paper, we focus on a specific variational method (CLG [1]) that requires solving linear systems which depend on two model parameters, $\alpha$ and $\rho$. To solve the problem efficiently, we study convergence speed as a function of these parameters. We benchmark standard linear solvers and preconditioners to identify the fastest in terms of convergence per iteration. We then show that, once implemented on a GPU, the most efficient solver changes depending on the model parameters. For $640 \times 480$ images, with the right choice of solver and parameters, our implementation can solve the system to a relative accuracy of $10^{-8}$ in 15 ms on a Titan V GPU. All results are aggregated over a 30-image set to increase confidence in their generalizability.
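To make the abstract concrete, the sketch below is a minimal, hypothetical illustration (not the paper's actual implementation) of the kind of preconditioned iterative solve discussed above: a Jacobi-preconditioned conjugate gradient applied to a small symmetric positive-definite system resembling the discretization of a variational model with a regularization weight `alpha`. The matrix, dimensions, and parameter values are invented for demonstration; the effect of the smoothing parameter $\rho$ is not modeled here.

```python
# Hypothetical sketch: Jacobi-preconditioned conjugate gradient (PCG) on a toy
# SPD system (data term + alpha * 1D Laplacian). Illustrates how the
# regularization weight influences conditioning and iteration count; it is not
# the paper's GPU implementation.
import numpy as np

def jacobi_pcg(A, b, rtol=1e-8, max_iter=10_000):
    """Conjugate gradient with a diagonal (Jacobi) preconditioner."""
    x = np.zeros_like(b)
    r = b - A @ x
    m_inv = 1.0 / np.diag(A)              # Jacobi preconditioner M^{-1}
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        Ap = A @ p
        step = rz / (p @ Ap)               # CG step size
        x += step * p
        r -= step * Ap
        if np.linalg.norm(r) <= rtol * b_norm:   # relative residual criterion
            return x, k + 1
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p          # new search direction
        rz = rz_new
    return x, max_iter

# Toy SPD system: positive diagonal data term + alpha * (1D Laplacian).
n = 256
alpha = 0.05                               # hypothetical regularization weight
rng = np.random.default_rng(0)
data = np.diag(rng.uniform(0.5, 1.5, n))
lap = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A = data + alpha * lap
b = np.ones(n)

x, iters = jacobi_pcg(A, b)
print(f"converged in {iters} iterations, relative residual "
      f"{np.linalg.norm(b - A @ x) / np.linalg.norm(b):.2e}")
```

Varying `alpha` in this toy setup changes the number of iterations needed to reach the $10^{-8}$ relative-residual target, which mirrors the abstract's point that the best solver choice depends on the model parameters.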