Run-Time Parallelization and Scheduling of Loops

The authors study run-time methods to automatically parallelize and schedule iterations of a do loop in cases where compile-time information is inadequate. The methods involve execution-time preprocessing of the loop. At compile time, these methods set up the framework for performing a loop dependency analysis. At run time, wavefronts of concurrently executable loop iterations are identified, and this wavefront information is used to reorder loop iterations for increased parallelism. The authors use symbolic transformation rules to produce inspector procedures, which perform the execution-time preprocessing, and executors, which are transformed versions of the source code loop structures. The executors carry out the calculations planned in the inspector procedures. The authors present performance results from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
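
The inspector/executor split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' generated code. It assumes a loop of the form `x[i] = x[i] + b[i] * x[ia[i]]`, where the indirection array `ia` is known only at run time and satisfies `ia[i] <= i`, so that every cross-iteration dependence points to an earlier iteration. The names `inspector`, `executor`, and `sequential` are hypothetical, and the executor visits the iterations of each wavefront serially where a parallel machine would run them concurrently.

```python
def inspector(ia):
    """Run-time preprocessing: group loop iterations into wavefronts.

    Assumption of this sketch: ia[i] <= i for every i, so iteration i
    depends at most on the earlier iteration ia[i].
    """
    n = len(ia)
    wavefront = [0] * n
    for i in range(n):
        j = ia[i]
        if j > i:
            raise ValueError("this sketch handles only ia[i] <= i")
        if j < i:
            # Iteration i must run after iteration j has written x[j].
            wavefront[i] = wavefront[j] + 1
    # Collect the iterations of each wavefront in increasing index order.
    schedule = [[] for _ in range(max(wavefront, default=-1) + 1)]
    for i, w in enumerate(wavefront):
        schedule[w].append(i)
    return schedule


def executor(x, b, ia, schedule):
    """Transformed loop: run the wavefronts in order.

    Iterations inside one wavefront do not depend on each other, so a
    parallel implementation could execute them concurrently; they are
    visited serially here to keep the sketch simple.
    """
    x = list(x)
    for wave in schedule:
        for i in wave:
            x[i] = x[i] + b[i] * x[ia[i]]
    return x


def sequential(x, b, ia):
    """Original loop in its source order, used as a reference."""
    x = list(x)
    for i in range(len(x)):
        x[i] = x[i] + b[i] * x[ia[i]]
    return x


if __name__ == "__main__":
    ia = [0, 0, 1, 0, 3, 2]
    x = [1, 2, 3, 4, 5, 6]
    b = [1, 1, 2, 1, 3, 1]
    schedule = inspector(ia)
    print(schedule)
    print(executor(x, b, ia, schedule) == sequential(x, b, ia))
```

For this `ia`, the six iterations fall into four wavefronts, so a parallel executor would need four synchronized steps rather than six serial ones. The reordered execution produces the same values as the original loop order.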
