Toward a pixel-parallel architecture for graph cuts inference on FPGA

The Graph Cuts method converts a Maximum a Posteriori (MAP) inference problem on a Markov Random Field (MRF) into a network flow problem, which can be solved efficiently. Many computer vision problems can be conveniently cast as inference tasks that find the most likely label for each pixel. The method is widely used but computationally burdensome. Prior accelerator attempts have failed to exploit the maximum parallelism the problem offers: push-relabel flow solvers can run in parallel across every pixel. This paper describes the design and implementation of the first pixel-parallel Graph Cuts inference engine. Our prototype handles a 256-pixel tile of an image, implemented as 256 locally connected pixel processors. A checkerboard scheduling scheme allows maximum parallelism while avoiding critical data dependencies. A 150 MHz implementation on an FPGA can solve a segmentation task in 6 microseconds. We also discuss strategies for extending our prototype to larger "virtual" images that span more than the physical extent of the inference tile. Our model suggests 2–40× speedups compared with previous accelerator experiments. To the best of our knowledge, this is the first fully functional, pixel-parallel accelerator demonstration for Graph Cuts inference.
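As an illustration of how checkerboard scheduling and per-pixel push-relabel fit together, the following is a minimal software sketch, not the hardware design described in the paper. It assumes a 4-connected grid, one uniform neighbour capacity `w`, and integer terminal capacities; the function name `checkerboard_maxflow` and all variable names are hypothetical. Pixels are split into two colours by the parity of `r + c`. Every neighbour of a pixel has the opposite colour, so all pixels of one colour could apply a push or relabel step at the same time without reading a height that another active pixel is changing. The sketch runs each colour phase sequentially, which gives the same result as running it in parallel.

```python
def checkerboard_maxflow(src, snk, w):
    """Max-flow value on a 4-connected grid, checkerboard-scheduled push-relabel.

    src[r][c], snk[r][c]: capacities source->pixel and pixel->sink.
    w: capacity of every neighbour edge (uniform, both directions; an assumption).
    """
    rows, cols = len(src), len(src[0])
    n = rows * cols + 2                    # node count; also the source height
    dirs = ((0, 1), (1, 0), (0, -1), (-1, 0))
    excess = [row[:] for row in src]       # source edges start saturated
    back = [row[:] for row in src]         # residual pixel->source capacity
    to_sink = [row[:] for row in snk]      # residual pixel->sink capacity
    height = [[0] * cols for _ in range(rows)]
    res = {}                               # residual capacity of neighbour edges
    for r in range(rows):
        for c in range(cols):
            for dr, dc in dirs:
                if 0 <= r + dr < rows and 0 <= c + dc < cols:
                    res[(r, c), (r + dr, c + dc)] = w
    flow = 0

    def step(r, c):
        """One push or one relabel for an active pixel."""
        nonlocal flow
        out = []                           # residual edges: (target height, kind, target)
        if to_sink[r][c] > 0:
            out.append((0, "sink", None))
        if back[r][c] > 0:
            out.append((n, "source", None))
        for dr, dc in dirs:
            q = (r + dr, c + dc)
            if res.get(((r, c), q), 0) > 0:
                out.append((height[q[0]][q[1]], "pixel", q))
        for target_height, kind, q in out:
            if height[r][c] == target_height + 1:      # admissible edge: push
                if kind == "sink":
                    d = min(excess[r][c], to_sink[r][c])
                    to_sink[r][c] -= d
                    flow += d
                elif kind == "source":
                    d = min(excess[r][c], back[r][c])
                    back[r][c] -= d
                else:
                    d = min(excess[r][c], res[(r, c), q])
                    res[(r, c), q] -= d
                    res[q, (r, c)] += d
                    excess[q[0]][q[1]] += d
                excess[r][c] -= d
                return
        # No admissible edge: relabel just above the lowest residual target.
        height[r][c] = 1 + min(t for t, _, _ in out)

    while any(e > 0 for row in excess for e in row):
        for colour in (0, 1):              # two phases per sweep
            for r in range(rows):
                for c in range(cols):
                    if (r + c) % 2 == colour and excess[r][c] > 0:
                        step(r, c)
    return flow
```

The returned value equals the minimum cut cost, that is, the minimum energy of the binary labeling problem under these assumptions. A labeling can then be recovered from residual reachability, which the sketch omits for brevity.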
