A Comparison of Task-Duplication-Based Algorithms for Scheduling Parallel Programs to Message-Passing Systems

A major hurdle in achieving high performance on message-passing architectures is the communication overhead incurred when tasks scheduled on different processors need to exchange data. This overhead can impose a severe penalty, especially in distributed systems such as clusters of workstations, where the network channels are considerably slower than the processors. For a given parallel program represented by a task graph, the communication overhead can be mitigated by redundantly executing some of the tasks on which other tasks critically depend. A few task-duplication-based scheduling algorithms have been designed for such environments. Although each of these algorithms has independently been shown to be effective, no attempt has been made to compare their performance quantitatively under a broad range of input parameters. In this paper, we analyze the problem of using task duplication in compile-time scheduling of task graphs on parallel and distributed systems. We discuss the characteristics of six recently proposed algorithms and examine their merits, differences, and suitability for different environments. Through a comprehensive experimental evaluation, the six algorithms are compared in terms of schedule length, number of processors used, and scheduling time required.
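To illustrate why redundant execution can shorten a schedule, the following is a minimal sketch and not any of the six algorithms compared in the paper. It assumes a hypothetical three-task fork graph (A feeds B and C), made-up computation and communication costs, and a hypothetical helper `schedule_length` that evaluates a fixed task-to-processor placement. A message costs time only when parent and child copies run on different processors.

```python
def schedule_length(cost, comm, order, procs_of):
    """Return the finish time of the last task copy for a fixed placement.

    cost:     task -> computation time
    comm:     (parent, child) -> message time, paid only across processors
    order:    tasks in a topological order (also the run order on each processor)
    procs_of: task -> list of processors that run a copy of the task
    """
    finish = {}    # (task, processor) -> finish time of that copy
    free_at = {}   # processor -> time at which it becomes idle
    for task in order:
        parents = [u for (u, v) in comm if v == task]
        for p in procs_of[task]:
            ready = 0
            for u in parents:
                # Data arrives from whichever copy of the parent delivers it first;
                # a copy on the same processor delivers it at no communication cost.
                arrival = min(
                    finish[(u, q)] + (0 if q == p else comm[(u, task)])
                    for q in procs_of[u]
                )
                ready = max(ready, arrival)
            start = max(ready, free_at.get(p, 0))
            finish[(task, p)] = start + cost[task]
            free_at[p] = finish[(task, p)]
    return max(finish.values())


# Hypothetical fork graph: A -> B and A -> C, with slow communication.
cost = {"A": 2, "B": 3, "C": 3}
comm = {("A", "B"): 4, ("A", "C"): 4}
order = ["A", "B", "C"]

# Without duplication, C on processor 1 must wait for A's message.
no_dup = schedule_length(cost, comm, order, {"A": [0], "B": [0], "C": [1]})
# With duplication, processor 1 recomputes A locally instead of waiting.
with_dup = schedule_length(cost, comm, order, {"A": [0, 1], "B": [0], "C": [1]})
print(no_dup, with_dup)  # prints "9 5"
```

In this toy case, duplicating A costs 2 time units of extra computation on the second processor but removes a 4-unit message from the critical path. The sketch only evaluates a given placement; deciding which tasks to duplicate, and where, is the part that the scheduling algorithms themselves must solve.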
