Coarse-Grained Loop Parallelization: Iteration Space Slicing vs Affine Transformations

Automatic coarse-grained parallelization of program loops is of great importance for multi-core computing systems. This paper compares Iteration Space Slicing and Affine Transformation Framework algorithms aimed at extracting the coarse-grained parallelism available in arbitrarily nested parameterized affine loops. We demonstrate that Iteration Space Slicing extracts more coarse-grained parallelism than the Affine Transformation Framework. Experimental results show that, by means of Iteration Space Slicing algorithms, we are able to extract coarse-grained parallelism for most loops of the NAS and UTDSP benchmarks, and that advanced algorithms for calculating the exact transitive closure of dependence relations are needed in order to increase the applicability of the slicing framework.
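To make the notion of synchronization-free coarse-grained parallelism concrete, the sketch below shows a hypothetical C/OpenMP loop nest (not taken from the paper or its benchmarks): flow dependences link only iterations with the same value of i, so the iteration space decomposes into independent slices, one dependence chain per i, that can run as coarse-grained parallel tasks without inter-thread synchronization. The array name a and the bounds N and M are illustrative assumptions.

```c
/* Minimal illustrative sketch (assumed example, not from the paper):
 * the only flow dependence is S(i,j) -> S(i,j+1), i.e. a[i][j] depends
 * on a[i][j-1].  Each value of i therefore forms an independent slice,
 * and the outer loop can be parallelized without synchronization. */
#include <stdio.h>
#include <omp.h>

#define N 8
#define M 8

int main(void) {
    static double a[N][M + 1];

    /* initialize the array */
    for (int i = 0; i < N; i++)
        for (int j = 0; j <= M; j++)
            a[i][j] = i + j;

    /* one coarse-grained, synchronization-free slice per value of i */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 1; j <= M; j++)
            a[i][j] = a[i][j - 1] + 1.0;  /* serial chain inside a slice */

    printf("a[N-1][M] = %f\n", a[N - 1][M]);
    return 0;
}
```

Compiled with an OpenMP-enabled compiler (e.g. gcc -fopenmp), each slice is executed by one thread; dependences within a slice are respected by the sequential inner loop, while no dependence crosses slice boundaries.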
