Extracting Coarse-Grained Parallelism in Program Loops with the Slicing Framework

We present a novel approach for extracting coarse-grained parallelism, represented as slices that are either independent or require synchronization. Each slice consists of dependent iterations of perfectly nested loops. The presented algorithms work for both uniform and non-uniform loops. Our approach, which is based on operations on relations and sets, requires exact dependence analysis. Examples illustrating the proposed algorithms and the results of experiments are presented.
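To make the notion of a slice concrete, the following is a minimal sketch, not the algorithms of the paper. It assumes a small, fixed iteration space that can be enumerated explicitly, whereas the approach described above operates on relations and sets. The function name `synchronization_free_slices`, the loop bound `N`, and the example loop are hypothetical and chosen only for illustration. Iterations connected by dependences are grouped into one slice (a weakly connected component of the dependence graph), so distinct slices can run in parallel without synchronization.

```python
from collections import defaultdict


def synchronization_free_slices(iterations, dependences):
    """Group iterations into slices.

    A slice here is a weakly connected component of the dependence graph:
    iterations in different slices share no dependence, so the slices can
    execute independently. Uses a simple union-find structure.
    """
    parent = {it: it for it in iterations}

    def find(x):
        # Find the representative of x, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in dependences:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src

    slices = defaultdict(list)
    for it in sorted(iterations):
        slices[find(it)].append(it)
    return sorted(slices.values())


# Hypothetical perfectly nested loop with a uniform dependence:
#   for i in range(N):
#       for j in range(1, N):
#           a[i][j] = a[i][j - 1] + 1
# Iteration (i, j) depends on iteration (i, j - 1).
N = 4
iterations = [(i, j) for i in range(N) for j in range(N)]
dependences = [((i, j - 1), (i, j)) for i in range(N) for j in range(1, N)]

slices = synchronization_free_slices(iterations, dependences)
for s in slices:
    print(s)
```

In this example every dependence stays within a single value of `i`, so each row of the iteration space forms its own slice and the outer loop can be run in parallel. Explicit enumeration only works for small constant bounds; with symbolic bounds, the dependences would have to be represented and manipulated as relations instead.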
