Code Generation for High-Level Synthesis of Multiresolution Applications on FPGAs

Multiresolution Analysis (MRA) is a mathematical method that is based on working on a problem at different scales. One of its applications is medical imaging where pro- cessing at multiple scales—based on the concept of Gaussian and Laplacian image pyramids—is a well-known technique. It is often applied to reduce noise while preserving image detail on different levels of granularity without modifying the filter kernel. In scientific computing, multigrid methods are a popular choice, as they are asymptotically optimal solvers for elliptic Partial Differential Equations (PDEs). As such algorithms have a very high computational complexity that would overwhelm CPUs in the presence of real-time constraints, application-specific processors come into consideration for implementation. Despite of huge advancements in leveraging productivity in the respective fields, designers are still required to have detailed knowledge about coding techniques and the targeted architecture to achieve efficient solutions. Recently, the HIPA cc framework was proposed as a means for automatic code generation of image processing algorithms, based on a Domain-Specific Language (DSL). From the same code base, it is possible to generate code for efficient implementations on several accelerator technologies including different types of Graphics Processing Units (GPUs) as well as reconfigurable logic (FPGAs). In this work, we demonstrate the ability of HIPA cc to generate code for the implementation of multiresolution applications on FPGAs and embedded GPUs. I. INTRODUCTION

[1]  Helmar Burkhart,et al.  PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[2]  Jürgen Teich,et al.  Domain-specific augmentations for High-Level Synthesis , 2014, 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors.

[3]  Walid A. Najjar,et al.  Input data reuse in compiling window operations onto reconfigurable hardware , 2004, LCTES '04.

[4]  Jürgen Teich,et al.  Generating Device-specific GPU Code for Local Operators in Medical Imaging , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.

[5]  Roberto Manduchi,et al.  Bilateral filtering for gray and color images , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[6]  Jürgen Teich,et al.  Towards Domain-Specific Computing for Stencil Codes in HPC , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.

[7]  Jürgen Teich,et al.  Code generation from a domain-specific language for C-based HLS of hardware accelerators , 2014, 2014 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS).

[8]  Jürgen Teich,et al.  ExaStencils: Advanced Stencil-Code Engineering , 2014, Euro-Par Workshops.

[9]  Edward H. Adelson,et al.  The Laplacian Pyramid as a Compact Image Code , 1983, IEEE Trans. Commun..

[10]  Jürgen Teich,et al.  A deeply pipelined and parallel architecture for denoising medical images , 2010, 2010 International Conference on Field-Programmable Technology.

[11]  Jürgen Teich,et al.  Code generation for embedded heterogeneous architectures on android , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[12]  Pat Hanrahan,et al.  Darkroom , 2014, ACM Trans. Graph..

[13]  Bradley C. Kuszmaul,et al.  The pochoir stencil compiler , 2011, SPAA '11.

[14]  Jürgen Teich,et al.  PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications , 2008, ARC.

[15]  Jürgen Teich,et al.  An image processing library for C-based high-level synthesis , 2014, 2014 24th International Conference on Field Programmable Logic and Applications (FPL).

[16]  Jürgen Teich,et al.  Towards a performance-portable description of geometric multigrid algorithms using a domain-specific language , 2014, J. Parallel Distributed Comput..