FLB: Fast Load Balancing for distributed-memory machines

This paper describes a novel compile-time list-based task scheduling algorithm for distributed-memory systems, called Fast Load Balancing (FLB). Compared to other typical list scheduling heuristics, FLB drastically reduces scheduling time complexity to O(V(log(W)+log (P))+E), where V and E are the number of tasks and edges in the task graph, respectively, W is the task graph width and P is the number of processors. It is proven that FLB is essentially equivalent to the existing ETF scheduling algorithm of O(W(E+V)P) time complexity. Experiments also show that FLB performs equally to other one-step algorithms of much higher cost, such as MCP. Moreover, FLB consistently outperforms multi-step algorithms such as DSC-LLB that also have higher cost.

[1]  Ishfaq Ahmad,et al.  Benchmarking the task graph scheduling algorithms , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.

[2]  S. Ranka,et al.  Applications and performance analysis of a compile-time optimization approach for list scheduling algorithms on distributed memory multiprocessors , 1992, Proceedings Supercomputing '92.

[3]  Boontee Kruatrachue,et al.  Grain size determination for parallel processing , 1988, IEEE Software.

[4]  Arjan J. C. van Gemund,et al.  On the complexity of list scheduling algorithms for distributed-memory systems , 1999, ICS '99.

[5]  Arjan J. C. van Gemund,et al.  LLB: A fast and effective scheduling algorithm for distributed-memory systems , 1999, Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999.

[6]  Daniel Gajski,et al.  Hypertool: A Programming Aid for Message-Passing Systems , 1990, IEEE Trans. Parallel Distributed Syst..

[7]  Frank D. Anger,et al.  Scheduling Precedence Graphs in Systems with Interprocessor Communication Times , 1989, SIAM J. Comput..

[8]  Ishfaq Ahmad,et al.  A New Approach to Scheduling Parallel Programs Using Task Duplication , 1994, 1994 Internatonal Conference on Parallel Processing Vol. 2.

[9]  Edward A. Lee,et al.  A Compile-Time Scheduling Heuristic for Interconnection-Constrained Heterogeneous Processor Architectures , 1993, IEEE Trans. Parallel Distributed Syst..

[10]  Tao Yang,et al.  DSC: Scheduling Parallel Tasks on an Unbounded Number of Processors , 1994, IEEE Trans. Parallel Distributed Syst..

[11]  Vivek Sarkar,et al.  Partitioning and scheduling parallel programs for execution on multiprocessors , 1987 .