A Duplication Based Compile Time Scheduling Method for Task Parallelism

The cost of inter-processor communication is one of the major bottlenecks of a distributed memory machine (DMM); it can be offset with efficient algorithms for task partitioning and scheduling. Based on the data dependencies, the task partitioning algorithm partitions the application program into tasks and represents them in the form of a directed acyclic graph (DAG) or in compiler intermediate forms. The scheduling algorithm then assigns the tasks to individual processors of the DMM in an effort to lower the overall parallel time. Obtaining an optimal schedule for a general DAG has long been known to be an NP-hard problem. This chapter presents a Scalable Task Duplication based Scheduling (STDS) algorithm which can schedule the tasks of a DAG with a worst-case complexity of O(|V|^2), where V is the set of tasks of the DAG. The STDS algorithm generates an optimal schedule for a certain class of DAGs which satisfy a Cost Relationship Condition (CRC), provided the required number of processors is available. If the required number of processors is not available, the algorithm scales the schedule down to the available number of processors. The performance of the scheduling algorithm has been evaluated by applying it to practical DAGs and by comparing the parallel time of the generated schedule against the absolute or theoretical lower bound.
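To make the idea of a lower bound on parallel time concrete, the following is a minimal sketch, not the chapter's STDS algorithm. It assumes a DAG given as per-task computation costs plus dependency edges, and it computes the longest path counting computation costs only (communication costs treated as zero). No schedule can finish earlier than this, whatever the number of processors; whether this matches the chapter's exact definition of the absolute lower bound is an assumption. The function name `critical_path_lower_bound` and the example graph are hypothetical.

```python
from collections import defaultdict


def critical_path_lower_bound(cost, edges):
    """Longest path through the DAG, counting computation costs only.

    cost  -- dict mapping each task to its computation cost
    edges -- iterable of (predecessor, successor) pairs
    Communication costs are deliberately ignored (assumption of this sketch).
    """
    successors = defaultdict(list)
    indegree = {task: 0 for task in cost}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    # Earliest possible finish time of each task if communication were free.
    finish = {task: cost[task] for task in cost}

    # Process tasks in topological order (Kahn's algorithm).
    ready = [task for task in cost if indegree[task] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in successors[u]:
            finish[v] = max(finish[v], finish[u] + cost[v])
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)

    if visited != len(cost):
        raise ValueError("graph contains a cycle; a DAG is required")
    return max(finish.values(), default=0)


# Hypothetical four-task diamond: a -> {b, c} -> d
example_cost = {"a": 2, "b": 3, "c": 4, "d": 1}
example_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(critical_path_lower_bound(example_cost, example_edges))
```

The parallel time of any schedule for this graph can be compared against the returned value; a schedule that reaches it is optimal for that graph.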
