Domain-specific augmentations for High-Level Synthesis

High-Level Synthesis (HLS) has become a popular means of rapidly developing production-ready implementations for FPGAs. The ever-increasing flexibility of HLS frameworks, however, demands a high level of domain-specific knowledge from the designer. In window-based image processing, examples of such knowledge are median computation and border handling. Depending on the window size, writing the code for such operations can become overwhelming even at very high abstraction levels. To increase productivity and to make the underlying architecture accessible to non-experts, we propose combining HLS with domain-specific augmentations. Specifically, we propose a new language extension in the form of a reduction for sorting and median computation. Furthermore, we introduce a new high-level transformation that performs multiple kinds of border treatment automatically. Both augmentations can considerably reduce the number of lines of code required. The increase in productivity is analyzed by comparing the lines of code needed to specify a median filter in PAULA, for synthesis using PARO, with those needed in C++, for synthesis using a commercial HLS tool.
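To give a sense of the hand-written code that median computation and border handling require, the following is a minimal software sketch of a 3x3 median filter in plain C++. It is not the PAULA extension or the C++ HLS code evaluated in this work; the names `Border`, `resolve`, `pixel_at`, and `median3x3` are hypothetical, the three border modes are only common examples, and no synthesis pragmas are shown. The sketch assumes a row-major image whose width and height are both at least 2.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical border modes, chosen for illustration only.
enum class Border { Clamp, Mirror, Constant };

// Map a coordinate that may lie one pixel outside [0, n) back into range.
// Only used for Clamp and Mirror; assumes n >= 2.
inline int resolve(int i, int n, Border mode) {
    if (i >= 0 && i < n) return i;
    if (mode == Border::Clamp) return i < 0 ? 0 : n - 1;
    // Mirror without repeating the edge pixel: -1 -> 1, n -> n - 2.
    return i < 0 ? -i : 2 * (n - 1) - i;
}

// Read one pixel of a row-major image, applying the border treatment.
inline int pixel_at(const std::vector<int>& img, int w, int h,
                    int x, int y, Border mode, int fill) {
    const bool inside = x >= 0 && x < w && y >= 0 && y < h;
    if (!inside && mode == Border::Constant) return fill;
    return img[resolve(y, h, mode) * w + resolve(x, w, mode)];
}

// 3x3 median filter: gather the window, then select its middle element.
std::vector<int> median3x3(const std::vector<int>& img, int w, int h,
                           Border mode, int fill = 0) {
    std::vector<int> out(img.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            std::array<int, 9> win{};
            std::size_t k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    win[k++] = pixel_at(img, w, h, x + dx, y + dy, mode, fill);
            // Partial sort: places the 5th smallest of 9 values at index 4.
            std::nth_element(win.begin(), win.begin() + 4, win.end());
            out[y * w + x] = win[4];
        }
    }
    return out;
}
```

Even in this software-only form, every border mode adds a branch to the pixel access, and a larger window would require both a wider gather loop and a different selection index. A hardware-oriented version would typically also replace `std::nth_element` with an explicit sorting network, which grows quickly with the window size.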
