A Cross-Platform OpenVX Library for FPGA Accelerators

In computer vision, open programming standards such as OpenVX have emerged to combine portability with acceleration across devices. Achieving both goals on FPGAs remains a challenge, however, because FPGA code must still be adapted with proprietary extensions. For Xilinx devices only, the open-source HiFlipVX library partially solves this problem by providing a clean C++ OpenVX API that delivers the performance of proprietary extensions without exposing their complexity to the programmer. While HiFlipVX enables portability across Xilinx devices, portability between FPGA manufacturers remains an open challenge. This work extends HiFlipVX's capabilities with a twofold goal: i) to support Intel FPGA devices with different memory configurations, and ii) to enable execution on FPGAs as discrete accelerators. To accomplish these goals, the proposed implementation combines two HLS programming models: C++, using Intel's system of tasks, which makes it possible to coalesce nodes and reduce control overhead, and OpenCL, which provides efficient compute kernel nodes. On Intel FPGAs, compared with pure OpenCL implementations, the proposed implementation reduces kernel dispatch resources, saving up to 24% of ALUT resources for each kernel in a graph, and improves performance. Compared with state-of-the-art frameworks, gains are $2.6\times$ on average for representative applications such as the Canny edge detector and the Census transform.
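To make one of the representative applications concrete, the following is a minimal software sketch of a 3x3 Census transform in plain C++. It is not the HiFlipVX or OpenVX implementation: the function name `census3x3`, the row-major bit ordering, the "neighbor darker than center" comparison, and the choice to leave border pixels at zero are all assumptions made for illustration, and the code carries no HLS pragmas or vendor extensions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not part of HiFlipVX or OpenVX.
// Computes an 8-bit census signature per pixel from its 3x3 neighborhood.
// Assumptions: row-major 8-bit image, border pixels are left at 0, and a bit
// is set when the neighbor is strictly darker than the center pixel.
std::vector<std::uint8_t> census3x3(const std::vector<std::uint8_t>& src,
                                    std::size_t width, std::size_t height) {
    std::vector<std::uint8_t> dst(width * height, 0);
    if (width < 3 || height < 3 || src.size() != width * height) {
        return dst;  // nothing to compute, or inconsistent dimensions
    }
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const std::uint8_t center = src[y * width + x];
            std::uint8_t signature = 0;
            // Visit the window in row-major order, skipping the center.
            for (std::size_t dy = 0; dy < 3; ++dy) {
                for (std::size_t dx = 0; dx < 3; ++dx) {
                    if (dy == 1 && dx == 1) continue;
                    const std::uint8_t neighbor =
                        src[(y + dy - 1) * width + (x + dx - 1)];
                    const int bit = (neighbor < center) ? 1 : 0;
                    signature =
                        static_cast<std::uint8_t>((signature << 1) | bit);
                }
            }
            dst[y * width + x] = signature;
        }
    }
    return dst;
}
```

Because each output pixel depends only on a small fixed window of the input, kernels of this kind map naturally to streaming nodes in a vision graph.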
