Parallel Loop Scheduling for High Performance Computers

Executing loop iterations in parallel on a multiprocessor system is one of many ways to reduce a program's execution time. However, scheduling overhead and load imbalance among processors can keep a parallel loop from reaching maximum performance. This article reviews current loop scheduling algorithms and studies the tradeoff each makes between scheduling overhead and load balancing. Using analytical models, simulations, and experimental measurements, it compares the performance and scalability of chunk scheduling, self-scheduling, guided self-scheduling, factoring, and trapezoid self-scheduling.
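To make the overhead versus load-balance tradeoff concrete, the following minimal sketch lists the chunk sizes each named scheme would hand out for a loop of n iterations on p processors. It is an illustration under stated assumptions, not the article's own code: the function names are hypothetical, and the formulas are the commonly cited formulations (guided self-scheduling takes ceil(R/p) of the R remaining iterations; factoring allocates half of the remaining work per batch of p equal chunks; trapezoid self-scheduling decreases linearly from ceil(n/(2p)) to 1). The variants analyzed in the article may differ in detail.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)

def fixed_chunks(n, k):
    """Chunk scheduling with a fixed chunk size k; k = 1 is pure self-scheduling."""
    sizes, remaining = [], n
    while remaining > 0:
        size = min(k, remaining)
        sizes.append(size)
        remaining -= size
    return sizes

def guided_chunks(n, p):
    """Guided self-scheduling (assumed form): each chunk is ceil(remaining / p)."""
    sizes, remaining = [], n
    while remaining > 0:
        size = ceil_div(remaining, p)
        sizes.append(size)
        remaining -= size
    return sizes

def factoring_chunks(n, p):
    """Factoring (assumed factor 2): batches of p equal chunks, each ceil(remaining / 2p)."""
    sizes, remaining = [], n
    while remaining > 0:
        batch_size = ceil_div(remaining, 2 * p)
        for _ in range(p):
            if remaining == 0:
                break
            size = min(batch_size, remaining)
            sizes.append(size)
            remaining -= size
    return sizes

def trapezoid_chunks(n, p):
    """Trapezoid self-scheduling (assumed form): sizes fall linearly from first to last."""
    first, last = max(1, ceil_div(n, 2 * p)), 1
    count = ceil_div(2 * n, first + last)            # planned number of chunks
    step = (first - last) // (count - 1) if count > 1 else 0
    sizes, remaining, size = [], n, first
    while remaining > 0:
        actual = min(max(size, last), remaining)     # never below last, never past the end
        sizes.append(actual)
        remaining -= actual
        size -= step
    return sizes

if __name__ == "__main__":
    n, p = 1000, 4
    for name, sizes in [
        ("self-scheduling", fixed_chunks(n, 1)),
        ("chunk (k = n/p)", fixed_chunks(n, ceil_div(n, p))),
        ("guided", guided_chunks(n, p)),
        ("factoring", factoring_chunks(n, p)),
        ("trapezoid", trapezoid_chunks(n, p)),
    ]:
        # Fewer dispatches means less scheduling overhead; smaller late chunks
        # leave more room to even out load imbalance at the end of the loop.
        print(f"{name:16s} dispatches={len(sizes):4d} first={sizes[0]:3d} last={sizes[-1]:3d}")
```

The number of dispatches is a rough proxy for scheduling overhead, and the size of the final chunks is a rough proxy for how finely the tail of the loop can be balanced. Pure self-scheduling sits at one extreme (one dispatch per iteration), a single chunk per processor sits at the other, and the decreasing-size schemes fall in between.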
