Controller Synthesis for Mapping Partitioned Programs on Array Architectures

Processor arrays can be used as accelerators for many dataflow-dominant applications. Inherently, these applications have almost no control flow, but applying sophisticated partitioning and scheduling techniques to handle large-scale problems and to balance local memory requirements against I/O bandwidth comes at the cost of more complex control flow. Efficient control path synthesis is therefore one of the greatest challenges when compiling algorithms onto processor arrays. This paper presents an efficient methodology for automated control path synthesis when mapping partitioned algorithms onto processor arrays. The major advantages of the presented methodology are (a) control generation for different partitioning techniques and arbitrary parallelepiped tiles, (b) combined use of a global and a local control strategy to reduce the control overhead, and (c) up to 90 percent reduction in control path area and resources compared to existing approaches.
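To illustrate why partitioning adds control flow, the following minimal Python sketch compares a flat one-dimensional iteration space with the same space covered by fixed-size tiles. It is not the methodology of this paper: the function names, the `tile` parameter, and the restriction to one dimension are assumptions made for illustration only. The point is that the tiled version needs a guard that the flat loop does not.

```python
def flat_iterations(n):
    """Original iteration space: a single loop with no extra control."""
    return list(range(n))


def tiled_iterations(n, tile):
    """Cover the same space with tiles of size `tile` (hypothetical parameter).

    The outer loop walks over tiles and the inner loop over positions inside
    a tile. The guard `i < n` is the extra control introduced by partitioning:
    it masks positions of the last tile that fall outside the original space.
    """
    visited = []
    masked = 0
    num_tiles = (n + tile - 1) // tile  # ceiling division
    for t in range(num_tiles):
        for j in range(tile):
            i = t * tile + j  # global index rebuilt from tile and local index
            if i < n:
                visited.append(i)
            else:
                masked += 1  # position exists in the tile but not in the space
    return visited, masked


if __name__ == "__main__":
    visited, masked = tiled_iterations(10, 4)
    print(visited == flat_iterations(10), masked)
```

In this sketch the guard is evaluated at every position. Deciding where such conditions are evaluated and how they reach the processing elements is, loosely speaking, the kind of question that control path synthesis has to answer.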
