Adaptive Scheduling of Computations and Communications on Distributed-Memory Systems

Compile-time scheduling is one approach to extracting parallelism, and it has proved effective when execution behavior is predictable. Unfortunately, the performance of most priority-based scheduling algorithms is computation-dependent. Scheduling based on the earliest-startable-task concept produces reasonably short schedules only when the available parallelism is large enough to cover the communications. A priority-based decision is more effective when parallelism is low. We propose a scheduling approach in which the decision function combines two concepts: (1) task level as the global priority and (2) earliest-task-first as the local priority. The degree to which one concept dominates the other is controlled by a computation profile factor, defined as the ratio of task parallelism to communication. This factor is shown to be an upper bound on the deviation of the schedule length from the optimum. To tune the solution finish time, the scheduler is applied iteratively to the computation graph. In each iteration, the newly generated schedule is used to sharpen the task levels, which contributes to finding shorter schedules in the next iteration. Evaluation is carried out on a wide category of computation graphs with communications for which optimum schedules are known. Pure local scheduling and static priority-based scheduling are found to deviate significantly from the optimum on specific problem instances. By adapting the scheduling decision to the computation profile, our approach produces near-optimum solutions in far fewer iterations than other approaches.
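
The combined decision function described above can be illustrated with a minimal Python sketch of a single scheduling pass. It is not the paper's algorithm: the abstract does not give the exact form of the decision function, so the linear blend below, the weight `alpha` (a hypothetical stand-in for a value derived from the computation profile factor), the tie-breaking order, and all function names are assumptions made for illustration. The iterative refinement of task levels is not shown.

```python
from collections import defaultdict


def task_levels(cost, succ, comm):
    """Longest path (computation plus communication) from each task to an exit."""
    levels = {}

    def level(t):
        if t not in levels:
            levels[t] = cost[t] + max(
                (comm[(t, s)] + level(s) for s in succ[t]), default=0
            )
        return levels[t]

    for t in cost:
        level(t)
    return levels


def schedule(cost, comm, n_procs, alpha):
    """One list-scheduling pass with a blended priority (assumed form).

    alpha = 1.0 -> pure earliest-start (local) decision
    alpha = 0.0 -> pure task-level (global) decision
    """
    succ, pred = defaultdict(list), defaultdict(list)
    for (u, v) in comm:
        succ[u].append(v)
        pred[v].append(u)

    levels = task_levels(cost, succ, comm)
    proc_free = [0] * n_procs          # time at which each processor becomes idle
    placed = {}                        # task -> (processor, start, finish)
    remaining = set(cost)

    while remaining:
        ready = [t for t in remaining if all(u in placed for u in pred[t])]
        best = None
        for t in sorted(ready):
            for p in range(n_procs):
                # Earliest start of t on p: data from a predecessor on another
                # processor arrives only after the communication delay.
                est = proc_free[p]
                for u in pred[t]:
                    up, _, uf = placed[u]
                    est = max(est, uf + (0 if up == p else comm[(u, t)]))
                # Assumed blend: a low start time and a high level are both good.
                score = alpha * est - (1 - alpha) * levels[t]
                key = (score, est, t, p)
                if best is None or key < best[0]:
                    best = (key, t, p, est)
        _, t, p, est = best
        placed[t] = (p, est, est + cost[t])
        proc_free[p] = est + cost[t]
        remaining.remove(t)

    makespan = max(finish for _, _, finish in placed.values())
    return placed, makespan


# Hypothetical fork-join graph: task -> computation cost, edge -> communication cost.
cost = {"a": 1, "b": 2, "c": 2, "d": 1}
comm = {("a", "b"): 3, ("a", "c"): 3, ("b", "d"): 3, ("c", "d"): 3}

placed, makespan = schedule(cost, comm, n_procs=2, alpha=0.5)
print(makespan)
```

In this small example the communication cost of 3 outweighs the gain from using the second processor, so the sketch places all four tasks on one processor. Varying `alpha` between 0 and 1 shifts the decision between the level-driven and the earliest-start-driven extremes.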
