A deeply pipelined and parallel architecture for denoising medical images

In this paper, we present a nearly automatic synthesis of a highly complex, throughput-optimized FPGA architecture for an adaptive multiresolution filter as used in medical image processing. The filter consists of 16 modules working in parallel; the most computationally intensive module achieves software pipelining by a factor of 85, that is, the computations of 85 iterations overlap. By applying a state-of-the-art high-level synthesis tool, we show that this approach can be used for real-world applications. In addition, we show that our high-level synthesis tool is capable of reducing the well-known productivity gap of embedded system design by almost two orders of magnitude. Finally, we conclude that the FPGA implementation of the multiresolution image processing algorithm is far ahead of a comparable graphics card implementation in terms of power efficiency.
