A practical scheduling scheme for non-uniform parallel loops on distributed memory parallel machines

Loops without dependences among iterations are a rich source of parallelism in many applications. Among such loops, non-uniform loops, whose iterations have variable execution times, need efficient scheduling schemes to take advantage of the capabilities of parallel machines. We present a global distributed control scheme (GDC) to schedule non-uniform loops on distributed memory parallel machines. GDC decentralizes scheduling control among all processors while attempting to keep heavily loaded processors in charge of scheduling tasks. For comparative evaluation, GDC and other well-known scheduling schemes are implemented on a 512-processor Intel Delta parallel machine. Our experimental results show that GDC performs well on many applications with different characteristics.
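
To illustrate what a loop scheduling scheme decides, the following is a minimal sketch of guided self-scheduling, a well-known chunking rule in which each request receives the remaining iterations divided by the number of processors, rounded up. It is not GDC, whose details the abstract does not give, and the abstract does not name the schemes it was compared against; the function name `gss_chunks` and its parameters are illustrative assumptions.

```python
def gss_chunks(total_iterations, num_processors):
    """Chunk sizes handed out by guided self-scheduling (illustrative sketch).

    Each scheduling request receives ceil(remaining / num_processors)
    iterations, so chunks start large and shrink toward the end of the loop.
    """
    if total_iterations < 0 or num_processors < 1:
        raise ValueError("need total_iterations >= 0 and num_processors >= 1")

    remaining = total_iterations
    chunks = []
    while remaining > 0:
        # Integer ceiling division without floating point.
        size = -(-remaining // num_processors)
        chunks.append(size)
        remaining -= size
    return chunks


if __name__ == "__main__":
    # Hypothetical loop of 100 iterations scheduled over 4 processors.
    print(gss_chunks(100, 4))
```

Large early chunks keep scheduling overhead low, while small late chunks help even out finish times when iteration costs vary, which is the tension any scheme for non-uniform loops has to manage.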
