LLB: A fast and effective scheduling algorithm for distributed-memory systems

This paper presents a new algorithm called List-based Load Balancing (LLB) for compile-time task scheduling on distributed-memory machines. LLB is intended as a cluster-mapping and task-ordering step in the multi-step class of scheduling algorithms. Unlike current multi-step approaches, LLB integrates cluster mapping and task ordering in a single step. The benefits of this integration are twofold. First, it allows dynamic load balancing in time, because only the ready tasks are considered in the mapping process. Second, communication is also taken into account, as opposed to algorithms like WCM and GLB. The algorithm has a low time complexity of O(E + V(log V + log P)), where E is the number of dependences, V is the number of tasks, and P is the number of processors. Experimental results show that LLB outperforms known cluster-mapping algorithms of comparable complexity, improving schedule lengths by up to 42%. Furthermore, compared with LCA, a much higher-complexity algorithm, LLB obtains comparable results for fine-grain graphs and yields improvements of up to 16% for coarse-grain graphs.
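
The abstract does not spell out the steps of LLB, so the following is only a minimal sketch of the general idea it describes: a generic list scheduler that maps ready tasks to the processor that becomes idle first and accounts for communication delays between processors. It is not the LLB algorithm itself. It schedules individual tasks rather than clusters, and it scans the ready list linearly, so it does not reach the O(E + V(log V + log P)) bound. The function name `list_schedule` and the parameters `cost`, `succ`, and `comm` are hypothetical names chosen for this illustration.

```python
import heapq


def list_schedule(cost, succ, comm, num_procs):
    """Greedy list scheduling of a task DAG (illustrative, not LLB itself).

    cost: dict task -> computation time
    succ: dict task -> list of successor tasks
    comm: dict (u, v) -> delay if u and v run on different processors
    Returns (proc_of, start, makespan).
    """
    preds = {t: [] for t in cost}
    for u, vs in succ.items():
        for v in vs:
            preds[v].append(u)

    # A task is ready once all of its predecessors have been scheduled.
    unscheduled_preds = {t: len(preds[t]) for t in cost}
    ready = [t for t in cost if unscheduled_preds[t] == 0]

    # Min-heap of (time the processor becomes free, processor id).
    idle = [(0, p) for p in range(num_procs)]
    heapq.heapify(idle)

    proc_of, start, finish = {}, {}, {}

    def earliest_start(t, p, free_at):
        # Data from a predecessor on another processor arrives after
        # a communication delay; local data is available immediately.
        arrivals = [
            finish[u] + (0 if proc_of[u] == p else comm[(u, t)])
            for u in preds[t]
        ]
        return max([free_at] + arrivals)

    while ready:
        # Take the processor that becomes idle first.
        free_at, p = heapq.heappop(idle)
        # Only ready tasks are candidates; pick the one that can start
        # earliest on this processor (ties broken by task name).
        t = min(ready, key=lambda x: (earliest_start(x, p, free_at), x))
        ready.remove(t)
        proc_of[t] = p
        start[t] = earliest_start(t, p, free_at)
        finish[t] = start[t] + cost[t]
        heapq.heappush(idle, (finish[t], p))
        # Scheduling t may make some of its successors ready.
        for v in succ.get(t, []):
            unscheduled_preds[v] -= 1
            if unscheduled_preds[v] == 0:
                ready.append(v)

    return proc_of, start, max(finish.values())


# Example: a small fork-join graph a -> {b, c} -> d on two processors.
cost = {"a": 1, "b": 2, "c": 2, "d": 1}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
comm = {edge: 1 for edge in [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]}
proc_of, start, makespan = list_schedule(cost, succ, comm, num_procs=2)
```

Keeping the idle processors in a heap gives the log P term per scheduling step. Replacing the linear scan of `ready` with priority queues would be needed to approach the log V term quoted in the abstract.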
