A design flow for speeding up DSP applications in heterogeneous reconfigurable systems

In this paper, we propose a method for speeding up Digital Signal Processing (DSP) applications by partitioning them between reconfigurable hardware blocks of different granularity and mapping their critical parts onto coarse-grain reconfigurable hardware. The reconfigurable hardware blocks are embedded in a heterogeneous reconfigurable system architecture. The fine-grain part is implemented by an embedded FPGA unit, while the coarse-grain part uses a high-performance coarse-grain data-path that we developed. The design flow consists of three main steps: the analysis procedure, the mapping onto coarse-grain blocks, and the mapping onto the fine-grain hardware. The methodology is validated using five real-life applications: an OFDM transmitter, a medical imaging technique, a wavelet-based image compressor, a video compression scheme, and a JPEG encoder. The experimental results show that the speedup, relative to an all-FPGA solution, ranges from 1.55 to 4.17 for the considered applications.
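
As a rough illustration of the three-step flow (analysis, coarse-grain mapping, fine-grain mapping), the sketch below ranks kernels by the cycles they would save on coarse-grain hardware, moves the most critical ones there, and leaves the rest on the FPGA. It is a minimal sketch under stated assumptions, not the paper's algorithm: the `Kernel` record, the greedy ranking, the `max_coarse` capacity limit, and all cycle counts are hypothetical, and communication and reconfiguration overheads are ignored.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    """Hypothetical profiling record for one application kernel."""
    name: str
    fpga_cycles: int  # assumed execution time on the fine-grain FPGA
    cgra_cycles: int  # assumed execution time on the coarse-grain data-path


def partition(kernels, max_coarse):
    """Greedy split: the kernels with the largest cycle saving go to
    coarse-grain hardware (at most max_coarse of them); the rest stay on
    the FPGA. Kernels that would not get faster are never moved."""
    ranked = sorted(kernels,
                    key=lambda k: k.fpga_cycles - k.cgra_cycles,
                    reverse=True)
    coarse = [k for k in ranked[:max_coarse] if k.cgra_cycles < k.fpga_cycles]
    coarse_names = {k.name for k in coarse}
    fine = [k for k in kernels if k.name not in coarse_names]
    return coarse, fine


def speedup(kernels, coarse):
    """Speedup of the partitioned design relative to an all-FPGA mapping,
    assuming kernels run sequentially and transfers are free."""
    coarse_names = {k.name for k in coarse}
    all_fpga = sum(k.fpga_cycles for k in kernels)
    mixed = sum(k.cgra_cycles if k.name in coarse_names else k.fpga_cycles
                for k in kernels)
    return all_fpga / mixed


# Illustrative numbers only; they are not taken from the paper.
kernels = [
    Kernel("fft", fpga_cycles=800, cgra_cycles=200),
    Kernel("mapper", fpga_cycles=100, cgra_cycles=90),
    Kernel("control", fpga_cycles=100, cgra_cycles=150),
]
coarse, fine = partition(kernels, max_coarse=2)
result = speedup(kernels, coarse)
```

With these made-up inputs, `fft` and `mapper` move to the coarse-grain data-path and `control` stays on the FPGA. A realistic flow would also have to account for data transfers between the two hardware types, which this sketch leaves out.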
