Practical loop generation

This paper describes the integration of a formal loop generation technique into MARS, an auto-parallelizing compiler. We give a brief survey of loop generation techniques and then describe the loop generation strategy employed in our implementation. We set out the input and output representations that formal loop generation requires and show how such a transformation fits into a complete compiler strategy. Working with MARS' extended linear algebraic program representation and within the constraints of a global compiler strategy, we have successfully integrated a formal tool into a FORTRAN compiler and shown that the combination can outperform an existing commercial compiler.
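
To make the loop generation problem concrete, the sketch below derives loop bounds for a small transformed iteration space by Fourier-Motzkin elimination, one of the classical techniques in this area. It is an illustration only and is not claimed to be the strategy used in MARS. The function names (`eliminate`, `loop_bounds`, `scan`) and the example space are assumptions introduced here. The space is the square 0 <= i, j <= 3 after the skewing u = i + j, v = j, written as affine constraints over (u, v), and the polyhedron is assumed to be bounded.

```python
from fractions import Fraction
from math import ceil, floor

# A constraint (coeffs, bound) means: sum(coeffs[i] * x[i]) <= bound.

def eliminate(constraints, k):
    """Fourier-Motzkin step: project variable k out of the system."""
    pos = [c for c in constraints if c[0][k] > 0]
    neg = [c for c in constraints if c[0][k] < 0]
    out = [c for c in constraints if c[0][k] == 0]
    for pa, pb in pos:
        for na, nb in neg:
            # Scale both rows so the x_k coefficients are +1 and -1, then add.
            sp = Fraction(1) / pa[k]
            sn = Fraction(1) / (-na[k])
            coeffs = [sp * p + sn * n for p, n in zip(pa, na)]
            out.append((coeffs, sp * pb + sn * nb))
    return out

def loop_bounds(constraints, nvars):
    """systems[k] holds the constraints bounding x_k by x_0 .. x_{k-1}."""
    systems = [None] * nvars
    current = constraints
    for k in reversed(range(nvars)):  # innermost variable first
        systems[k] = [c for c in current if c[0][k] != 0]
        current = eliminate(current, k)
    return systems

def scan(systems, nvars):
    """Enumerate integer points in lexicographic order (bounded space assumed)."""
    points = []

    def rec(prefix):
        k = len(prefix)
        if k == nvars:
            points.append(tuple(prefix))
            return
        lo, hi = [], []
        for coeffs, bound in systems[k]:
            # Move the outer-loop terms to the right-hand side.
            rest = Fraction(bound) - sum(Fraction(c) * x for c, x in zip(coeffs, prefix))
            if coeffs[k] > 0:
                hi.append(floor(rest / coeffs[k]))  # upper bound on x_k
            else:
                lo.append(ceil(rest / coeffs[k]))   # lower bound on x_k
        for x in range(max(lo), min(hi) + 1):
            rec(prefix + [x])

    rec([])
    return points

# Skewed square over (u, v): 0 <= u - v <= 3 and 0 <= v <= 3.
space = [([1, -1], 3), ([-1, 1], 0), ([0, 1], 3), ([0, -1], 0)]
points = scan(loop_bounds(space, 2), 2)
```

For this space the derived nest corresponds to u running from 0 to 6 and v from max(0, u - 3) to min(3, u), which is the kind of bound expression a loop generator must emit. Plain Fourier-Motzkin elimination works over the rationals, so for general polyhedra the outer loops may visit values for which the inner range is empty.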
