Measuring the effectiveness of automatic parallelization in SUIF

This paper presents both an experiment and a system for inserting run-time dependence and privatization tests. The goal of the experiment is to measure empirically the remaining opportunities for exploiting loop-level parallelism that are missed by state-of-the-art parallelizing compiler technology. We perform run-time testing of data accessed within all candidate loops not parallelized by the compiler to identify which of these loops could safely execute in parallel for the given program input. This system extends the Lazy Privatizing Doall (LPD) test to simultaneously instrument multiple loops in a nest. Using the results of interprocedural array dataflow analysis, we avoid unnecessary instrumentation of arrays with compile-time provable dependences or loops nested inside outer parallelized loops. We have implemented the system in the Stanford SUIF compiler and have measured programs in three benchmark suites.
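
To make the idea of a run-time dependence and privatization test concrete, the following is a minimal sketch of an LPD-style check for a single array in a single loop. It is not the system described in the paper: the function name `lpd_test`, the trace format, and the result strings are assumptions made for illustration, and the sketch ignores reductions, last-value copy-out, and the multi-loop instrumentation the paper adds. Each iteration is given as a list of `("r", index)` or `("w", index)` accesses, standing in for the marks that instrumented code would record in shadow arrays.

```python
def lpd_test(iterations):
    """Classify a loop from per-iteration access traces to one array.

    iterations: list of traces; each trace is a list of (kind, index)
    pairs with kind "r" (read) or "w" (write). Hypothetical format.
    """
    written = set()         # elements written in at least one iteration
    read_no_write = set()   # elements read in an iteration that never writes them
    exposed_read = set()    # elements read before any write in the same iteration
    write_marks = 0         # count of (iteration, element) write pairs

    for trace in iterations:
        written_here = set()
        read_first = set()
        for kind, idx in trace:
            if kind == "w":
                written_here.add(idx)
            elif idx not in written_here:
                read_first.add(idx)  # read not covered by an earlier write
        written |= written_here
        write_marks += len(written_here)
        exposed_read |= read_first
        read_no_write |= read_first - written_here

    # An element written in one iteration and only read in another
    # carries a flow or anti dependence across iterations.
    if written & read_no_write:
        return "serial"
    # Every written element was written by exactly one iteration:
    # no output dependences, so no privatization is needed.
    if write_marks == len(written):
        return "parallel"
    # Several iterations write the same element; privatization is only
    # valid if no iteration reads that element before writing it.
    if written & exposed_read:
        return "serial"
    return "parallel_with_privatization"


# a[i] = ...            : independent writes
independent = [[("w", i)] for i in range(4)]
# a[i] = a[i-1]         : loop-carried flow dependence
carried = [[("r", i - 1), ("w", i)] for i in range(1, 4)]
# t[0] = ...; ... = t[0]: temporary written before it is read
temporary = [[("w", 0), ("r", 0)] for _ in range(3)]
```

In this sketch, `lpd_test(independent)` classifies the loop as parallel, `lpd_test(carried)` as serial, and `lpd_test(temporary)` as parallel only once the array is privatized. The result holds only for the observed input, which matches the abstract's point that such loops are safe "for the given program input".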
