Synchronization Minimization in a SPMD Execution Model

This paper presents an algorithm for synchronization placement under an SPMD execution model, in which synchronization is enforced only where a cross-processor data dependence exists. We investigate two scheduling techniques, loop-based and data-based, both of which use an SPMD model. A new technique uses scheduling information from earlier stages of compilation to determine potential cross-processor data dependences. Given the minimum number of cross-processor data dependences that must be satisfied, a new optimization then minimizes the number of synchronization points needed to satisfy them. The algorithm has been implemented in an experimental compiler, and initial experimental data show it to be very effective, outperforming existing methods.
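To make the minimization step concrete, the following is a minimal sketch of a simplified version of the problem, not the algorithm developed in the paper. It assumes a straight-line sequence of statements with no loops or branches, assumes each cross-processor dependence is given as a hypothetical pair `(src, dst)` of statement indices with `src < dst`, and assumes a single barrier placed anywhere between the two statements satisfies the dependence. Under those assumptions the task reduces to covering intervals with the fewest points, which a standard greedy scan solves. The function name `place_barriers` is illustrative only.

```python
def place_barriers(dependences):
    """Return barrier positions for a straight-line statement sequence.

    Assumptions (illustrative, not taken from the paper):
      * each dependence is a pair (src, dst) of statement indices, src < dst;
      * a barrier at position p sits between statement p and statement p + 1;
      * a dependence (src, dst) is satisfied by any barrier p with src <= p < dst.
    """
    barriers = []
    # Process dependences in order of their sink statement, so that each new
    # barrier can be pushed as late as possible and cover later dependences too.
    for src, dst in sorted(dependences, key=lambda d: d[1]):
        if src >= dst:
            raise ValueError("expected src < dst for every dependence")
        # The most recent barrier is the latest one placed; if it lies at or
        # after the source statement, this dependence is already satisfied.
        if barriers and barriers[-1] >= src:
            continue
        # Otherwise place a barrier immediately before the sink statement.
        barriers.append(dst - 1)
    return barriers


if __name__ == "__main__":
    deps = [(0, 3), (1, 2), (2, 5), (4, 6)]
    print(place_barriers(deps))
```

In this example four dependences are satisfied by two barriers, one after statement 1 and one after statement 4. Real programs contain loops and control flow, so this interval view is only a starting intuition for why fewer synchronization points than dependences can suffice.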
