Mapping a class of dependence algorithms to coarse-grained reconfigurable arrays: architectural parameters and methodology

Existing compilation techniques for coarse-grained reconfigurable arrays are closely related to approaches from the DSP world. These approaches employ several loop transformations, such as pipelining or temporal partitioning, but they cannot exploit the full parallelism of a given algorithm or the computational potential of a typical 2-dimensional array. This paper makes three contributions. First, we give an overview of the constraints that have to be considered when mapping applications to coarse-grained reconfigurable arrays. Second, we present our design methodology for mapping regular algorithms onto massively parallel arrays, which is characterised by loop parallelisation in the polytope model. Third, in a first case study, we adapt this methodology to target reconfigurable arrays. The case study shows that the presented regular mapping methodology may lead to highly efficient implementations that take the constraints of the architecture into account.
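
As a rough illustration of what loop parallelisation in the polytope model involves, the following Python sketch applies a linear space-time mapping to a small rectangular iteration space. It is a generic textbook-style example, not the methodology or tooling of the paper: the function names (`space_time_map`, `is_causal`, `is_conflict_free`), the FIR-like dependence vectors, the schedule vector `(1, 1)` and the allocation row `(0, 1)` are all assumptions chosen for illustration.

```python
from itertools import product


def space_time_map(bounds, schedule, allocation):
    """Map every point of a rectangular iteration space to (time, processor).

    bounds     -- loop extents, e.g. (4, 3) for 0 <= i < 4, 0 <= j < 3
    schedule   -- schedule vector; time = schedule . point
    allocation -- rows of the allocation matrix; processor = allocation * point
    """
    mapping = {}
    for point in product(*(range(n) for n in bounds)):
        t = sum(l * x for l, x in zip(schedule, point))
        p = tuple(sum(a * x for a, x in zip(row, point)) for row in allocation)
        mapping[point] = (t, p)
    return mapping


def is_causal(schedule, dependences):
    """A schedule is causal if every dependence takes at least one time step."""
    return all(
        sum(l * d for l, d in zip(schedule, dep)) >= 1 for dep in dependences
    )


def is_conflict_free(mapping):
    """No two iterations may occupy the same processor at the same time."""
    slots = list(mapping.values())
    return len(slots) == len(set(slots))


# Hypothetical uniform dependences of a FIR-like loop nest (assumed example).
dependences = [(0, 1), (1, 0), (1, 1)]
schedule = (1, 1)        # assumed schedule vector
allocation = [(0, 1)]    # assumed projection along i: processor index = j

mapping = space_time_map((4, 3), schedule, allocation)
latency = max(t for t, _ in mapping.values()) + 1
```

Under these assumptions the 12 iterations are executed on a 1-dimensional array of 3 processors, and `is_causal` and `is_conflict_free` check the two basic validity conditions of such a mapping. Architectural constraints of a concrete reconfigurable array (interconnect, memory, number of processing elements) are deliberately left out of the sketch.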
