Scheduling non-uniform parallel loops on distributed memory machines

A distributed self-scheduling scheme (DSSS) for scheduling parallel loops with variable-length iteration execution times on distributed memory machines is presented. DSSS combines static and dynamic scheduling and draws advantages from both: static scheduling reduces scheduling overhead, and dynamic scheduling balances the workload. The data distribution problem is partially solved because a major portion of the iterations is scheduled statically. For data needed in the dynamic scheduling phase, duplication of data minimizes data movement. DSSS and other well-known self-scheduling schemes were implemented on a 64-processor nCUBE/7. Experiments showed that DSSS performed well on parallel loops with different characteristics.
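The idea of combining a static phase with a dynamic phase can be illustrated with a small simulation. The sketch below is not the DSSS algorithm itself, whose details the abstract does not give. It only assumes that a fixed fraction of the iterations is pre-assigned in contiguous blocks and that the remainder is handed out in fixed-size chunks to whichever processor becomes idle first. The function name `hybrid_schedule` and the parameters `static_fraction` and `chunk` are hypothetical, and communication and data movement costs are ignored.

```python
import heapq


def hybrid_schedule(costs, num_procs, static_fraction=0.75, chunk=1):
    """Simulate a two-phase (static, then dynamic) loop schedule.

    costs[i] is the execution time of iteration i. Returns (makespan, finish),
    where finish[p] is the time at which processor p completes its work.
    Illustrative assumption only; not the scheme defined in the paper.
    """
    n = len(costs)
    n_static = int(n * static_fraction)

    # Phase 1 (static): contiguous blocks, no run-time scheduling decisions.
    finish = [0.0] * num_procs
    block = -(-n_static // num_procs)  # ceiling division
    for p in range(num_procs):
        lo = min(p * block, n_static)
        hi = min(lo + block, n_static)
        finish[p] = float(sum(costs[lo:hi]))

    # Phase 2 (dynamic): the processor that becomes idle first takes the next chunk.
    heap = [(t, p) for p, t in enumerate(finish)]
    heapq.heapify(heap)
    i = n_static
    while i < n:
        t, p = heapq.heappop(heap)
        j = min(i + chunk, n)
        t += sum(costs[i:j])
        finish[p] = t
        heapq.heappush(heap, (t, p))
        i = j

    return max(finish), finish


if __name__ == "__main__":
    # One expensive iteration among cheap ones (a non-uniform loop).
    costs = [1, 1, 1, 1, 10, 1, 1, 1]
    print(hybrid_schedule(costs, num_procs=2, static_fraction=0.5))
    print(hybrid_schedule(costs, num_procs=2, static_fraction=1.0))
```

With `static_fraction=1.0` the simulation degenerates to purely static block scheduling, and with `static_fraction=0.0` to purely dynamic chunk scheduling, so the same function can be used to compare the two extremes against a mixed setting on a given cost vector.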
