On the complexity of loop fusion

Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly to increase the granularity of loops and to improve data reuse. The goal of this paper is to study, from a theoretical point of view, several variants of the loop fusion problem (identifying polynomially solvable cases and NP-complete cases) and to make the link between these problems and some scheduling problems that arise in completely different areas. Among others, we study the fusion of loops of different types and the fusion of loops combined with loop shifting.
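
As a purely illustrative sketch (not taken from the paper), the following Python fragment shows what fusing two loops looks like, and how shifting one loop relative to the other can make a fusion legal that would otherwise break a dependence. All function names and the arithmetic in the loop bodies are hypothetical, and "shifting" is read here, by assumption, as delaying the iterations of the second loop by one.

```python
def two_loops(a):
    """Original program: two separate loops over the same index range."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] + 1
    for i in range(n):
        c[i] = b[i] * 2          # reads b[i], produced at the same index
    return b, c


def fused(a):
    """Fused version: b[i] is reused immediately after it is produced."""
    n = len(a)
    b = [0] * n
    c = [0] * n
    for i in range(n):
        b[i] = a[i] + 1
        c[i] = b[i] * 2
    return b, c


def two_loops_forward_read(a):
    """Variant where the second loop reads b[i + 1].

    Fusing these loops directly would read b[i + 1] before it is written.
    """
    n = len(a)
    b = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):
        b[i] = a[i] + 1
    for i in range(n - 1):
        c[i] = b[i + 1] * 2
    return b, c


def fused_with_shift(a):
    """Fusion after shifting the second loop by one iteration (assumed reading)."""
    n = len(a)
    b = [0] * n
    c = [0] * max(n - 1, 0)
    for i in range(n):
        b[i] = a[i] + 1
        if i >= 1:
            c[i - 1] = b[i] * 2  # iteration i - 1 of the second loop, delayed
    return b, c
```

In both pairs the fused loop computes the same arrays as the original program. The second pair only becomes fusable once the consumer loop is delayed, which is the kind of interaction between fusion and shifting that the paper's complexity questions concern.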