High-Level Synthesis of Loops Using the Polyhedral Model

High-level synthesis (HLS) of loops allows efficient handling of the compute-intensive parts of an application, e.g. in signal processing. Loop unrolling, the classical technique used in most HLS tools, cannot produce the regular parallel architectures that are often needed. In this chapter, we use the MMAlpha testbed to present the basic techniques at the heart of loop analysis and parallelization. We adopt the point of view of the polyhedral model of loops, in which iterative calculations are represented as recurrence equations on integral polyhedra. Using a string alignment example, we describe the transformations that enable HLS and explain how they can be combined into a synthesis flow.
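
To make the idea of a recurrence equation on an integral polyhedron concrete, here is a minimal Python sketch of a string alignment computation. It is an illustration under assumptions, not the chapter's own formulation: the unit-cost edit-distance scoring, the function names, and the schedule t = i + j are all chosen here for the example and do not come from the chapter or from MMAlpha. The first function evaluates the recurrence over the rectangular domain 0 <= i <= n, 0 <= j <= m in loop order. The second evaluates the same domain by anti-diagonals; every point on a diagonal depends only on earlier diagonals, which is the kind of regular parallelism that unrolling alone does not expose.

```python
def align_sequential(a, b):
    """Edit distance as a recurrence over {(i, j) | 0 <= i <= n, 0 <= j <= m}.

    Assumption: unit costs for insertion, deletion and substitution.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0:
                d[i][j] = j                      # boundary: insert j characters
            elif j == 0:
                d[i][j] = i                      # boundary: delete i characters
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,          # deletion
                              d[i][j - 1] + 1,          # insertion
                              d[i - 1][j - 1] + cost)   # match or substitution
    return d[n][m]


def align_wavefront(a, b):
    """Same recurrence, evaluated under the (assumed) schedule t = i + j.

    Dependences (i-1, j) and (i, j-1) lie on diagonal t-1 and (i-1, j-1)
    lies on diagonal t-2, so all points sharing one t are independent.
    """
    n, m = len(a), len(b)
    d = [[None] * (m + 1) for _ in range(n + 1)]
    for t in range(n + m + 1):                   # time steps (anti-diagonals)
        # Points of the domain on diagonal t; this inner loop could run in parallel.
        for i in range(max(0, t - m), min(n, t) + 1):
            j = t - i
            if i == 0:
                d[i][j] = j
            elif j == 0:
                d[i][j] = i
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,
                              d[i][j - 1] + 1,
                              d[i - 1][j - 1] + cost)
    return d[n][m]
```

Both functions return the same value for any pair of strings, since the wavefront version only reorders the evaluation while respecting every dependence. In a hardware setting, the inner loop of the wavefront version would correspond to cells that can be computed simultaneously, for example one per processing element of a linear array.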
