PARO: Synthesis of Hardware Accelerators for Multi-Dimensional Dataflow-Intensive Applications

In this paper, we present the PARO design tool for the automated hardware synthesis of massively parallel embedded architectures for given dataflow dominant applications. Key features of PARO are: (1) The design entry in form of a compact and intuitive functional programming language which allows highly parallel implementations. (2) Advanced partitioning techniques are applied in order to balance the trade-offs in cost and performance along with requisite throughputs. This is obtained by distributing computations onto an array of tightly coupled processor elements. (3) We demonstrate the performance of the FPGA synthesized hardware with several selected algorithms from different benchmarks.

[1]  Patrice Quinton,et al.  Hardware synthesis for multi-dimensional time , 2003, Proceedings IEEE International Conference on Application-Specific Systems, Architectures, and Processors. ASAP 2003.

[2]  Jr. Hamilton Richards Haskell: The Craft of Functional Programming by Simon Thompson, Addison-Wesley, 1996. , 1998 .

[3]  Jürgen Teich,et al.  Efficient control generation for mapping nested loop programs onto processor arrays , 2007, J. Syst. Archit..

[4]  Jürgen Teich,et al.  Partitioning Processor Arrays under Resource Constraints , 1997, J. VLSI Signal Process..

[5]  Paul Pinella,et al.  Mentor Graphics Corp. , 1993 .

[6]  Nikil D. Dutt,et al.  SPARK: a high-level synthesis framework for applying parallelizing compiler transformations , 2003, 16th International Conference on VLSI Design, 2003. Proceedings..

[7]  Jürgen Teich,et al.  Scheduling of partitioned regular algorithms on processor arrays with constrained resources , 1996, Proceedings of International Conference on Application Specific Systems, Architectures and Processors: ASAP '96.

[8]  Miodrag Potkonjak,et al.  MediaBench: a tool for evaluating and synthesizing multimedia and communications systems , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.

[9]  Jürgen Teich,et al.  Hierarchical Partitioning for Piecewise Linear Algorithms , 2006, International Symposium on Parallel Computing in Electrical Engineering (PARELEC'06).

[10]  Jürgen Teich,et al.  Resource constrained and speculative scheduling of an algorithm class with run-time dependent conditionals , 2004, Proceedings. 15th IEEE International Conference on Application-Specific Systems, Architectures and Processors, 2004..

[11]  Jürgen Teich,et al.  Automatic FIR Filter Generation for FPGAs , 2005, SAMOS.

[12]  Simon Thompson,et al.  Haskell: The Craft of Functional Programming , 1996 .

[13]  Jürgen Teich,et al.  A Design Methodology for Hardware Acceleration of Adaptive Filter Algorithms in Image Processing , 2006, IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06).

[14]  Stamatis Vassiliadis,et al.  Embedded Computer Systems: Architectures, Modeling, and Simulation 5th International Workshop, SAMOS 2005, Samos, Greece, July 18-20, 2005, Proceedings , 2005, International Conference / Workshop on Embedded Computer Systems: Architectures, Modeling and Simulation.