A Highly Efficient and Comprehensive Image Processing Library for C++-based High-Level Synthesis

Field Programmable Gate Arrays (FPGAs) have proven to be among the most suitable architectures for image processing applications. However, accelerating algorithms on FPGAs is time-consuming and requires hardware design expertise. Although recent advances in High-Level Synthesis (HLS) promise to solve this problem, today’s HLS tools still require efficient hardware descriptions of algorithms in order to produce favorable implementations. One solution is to develop highly parameterizable and optimized HLS libraries for the fundamental image processing components. Another is to provide a higher level of abstraction in the form of a Domain-Specific Language (DSL) together with a corresponding efficient back end for hardware design. In this paper, we provide a highly efficient and parameterizable C++ library for image processing applications that can serve as the cornerstone for both approaches. In our library, nodes of a stream-based data flow graph can be described as C++ objects for specified functions, and the whole application can be efficiently parallelized just by defining a global constant as the parallelization factor. Moreover, the key hardware design elements, i.e., line buffers and sliding windows with different border handling patterns, can be used individually to ease the design of more complicated applications. This is the author’s version of the work. The definitive work was published in Proceedings of the International Workshop on FPGAs for Software Programmers (FSP), co-located with the International Conference on Field Programmable Logic and Applications (FPL), Ghent, Belgium, September 7, 2017.
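To make the role of line buffers, sliding windows, and border handling more concrete, the following is a minimal software sketch of a streamed 3x3 local operator. It is not the API of the library described in the abstract: the names `Border`, `remap`, `filter3x3`, and `box_sum` are hypothetical, the window size is fixed at 3x3, and no HLS pragmas or parallelization factor are modeled. It only illustrates how two line buffers and a shifting window let each input pixel be read exactly once while out-of-range neighbors are resolved by a border pattern.

```cpp
#include <cassert>
#include <vector>

// Hypothetical border patterns (illustrative, not the library's own names).
enum class Border { Clamp, Mirror101, Constant };

// Map a coordinate into [0, size); -1 means "substitute the constant value".
inline int remap(int i, int size, Border mode) {
    if (i >= 0 && i < size) return i;
    switch (mode) {
        case Border::Clamp:     return i < 0 ? 0 : size - 1;
        case Border::Mirror101: return i < 0 ? -i : 2 * (size - 1) - i;
        default:                return -1;
    }
}

using Image = std::vector<int>;

// Example kernel: sum of all nine window elements.
inline int box_sum(const int (&w)[3][3]) {
    int s = 0;
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c) s += w[r][c];
    return s;
}

// Streams the image in raster order. The iteration space is extended by one
// row and one column because the output lags the input by (1, 1).
template <typename Kernel>
Image filter3x3(const Image& in, int W, int H, Border mode, Kernel kernel,
                int constant = 0) {
    assert(W >= 2 && H >= 2 && static_cast<int>(in.size()) == W * H);
    std::vector<int> lb0(W, 0), lb1(W, 0);  // line buffers: rows y-2 and y-1
    int win[3][3] = {};                     // sliding window: columns x-2..x
    Image out(W * H);
    for (int y = 0; y <= H; ++y) {
        for (int x = 0; x <= W; ++x) {
            int col[3] = {0, 0, 0};
            if (x < W) {
                int px = (y < H) ? in[y * W + x] : 0;  // each pixel read once
                col[0] = lb0[x]; col[1] = lb1[x]; col[2] = px;
                lb0[x] = lb1[x]; lb1[x] = px;          // advance line buffers
            }
            for (int r = 0; r < 3; ++r) {              // shift window left
                win[r][0] = win[r][1];
                win[r][1] = win[r][2];
                win[r][2] = col[r];
            }
            if (y < 1 || x < 1) continue;              // window not centered yet
            int cy = y - 1, cx = x - 1;                // window center
            int w[3][3];
            for (int r = 0; r < 3; ++r)
                for (int c = 0; c < 3; ++c) {
                    int sy = remap(cy + r - 1, H, mode);
                    int sx = remap(cx + c - 1, W, mode);
                    // Border handling selects a valid window element instead.
                    w[r][c] = (sy < 0 || sx < 0)
                                  ? constant
                                  : win[sy - cy + 1][sx - cx + 1];
                }
            out[cy * W + cx] = kernel(w);
        }
    }
    return out;
}
```

For a 3x3 image holding the values 1 to 9 in raster order, `filter3x3(img, 3, 3, Border::Clamp, box_sum)` yields 21 at the top-left pixel, whereas `Border::Constant` yields 12 and `Border::Mirror101` yields 33. The center pixel is 45 in all three cases, since its window never leaves the image.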
