Nested loop sequences: towards efficient loop structures in automatic parallelization

An important problem in automatic parallelization of scientific programs is to generate loops from an algebraic description of the iteration domain. The usual technique is to produce a perfectly nested set of loops whose bounds consist of maxima and minima of several affine functions. However, perfect loop nests suffer from the run-time overhead of evaluating these bound expressions and cannot scan non-convex domains efficiently. We study a candidate loop structure for overcoming these problems. This structure, called a nested loop sequence (NLS), is defined as a sequence of DO loops whose bodies are nonempty sequences of DO loops. We propose an algorithm to compute an NLS scanning a given convex polyhedron, which overcomes the run-time overhead problem. The algorithm produces a loop structure in which each bound of every loop is a single affine function.
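As an illustration of the difference between the two loop structures, the sketch below scans the convex domain {(i, j) : 0 <= i <= n, 0 <= j <= i, j <= m} in two ways. The domain, the function names, and the assumption 0 <= m <= n are hypothetical choices made for this example. The split into two loops was done by hand and is not the output of the algorithm proposed in the paper.

```python
def scan_perfect_nest(n, m):
    """Perfect loop nest: the inner upper bound is a minimum of two affine functions."""
    points = []
    for i in range(0, n + 1):
        # min(i, m) is re-evaluated on every iteration of the outer loop
        for j in range(0, min(i, m) + 1):
            points.append((i, j))
    return points


def scan_nested_loop_sequence(n, m):
    """Hand-written loop sequence for the same domain, assuming 0 <= m <= n.

    Every loop bound is a single affine function of the parameters and
    the enclosing loop index; no min or max is evaluated at run time.
    """
    points = []
    # First part of the domain: i <= m, so the binding constraint is j <= i
    for i in range(0, m + 1):
        for j in range(0, i + 1):
            points.append((i, j))
    # Second part of the domain: i > m, so the binding constraint is j <= m
    for i in range(m + 1, n + 1):
        for j in range(0, m + 1):
            points.append((i, j))
    return points
```

Under the stated assumption, both functions visit the same points in the same lexicographic order. The second version trades a longer loop structure for bounds that need no minimum or maximum at run time.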