Improving First-Come-First-Serve Job Scheduling by Gang Scheduling

We present a new scheduling method for batch jobs on massively parallel processor architectures. The method is based on the first-come-first-serve (FCFS) strategy and emphasizes fairness. Severe fragmentation is prevented by gang scheduling, which is initiated only by highly parallel jobs. Good worst-case behavior of the approach has already been proven by theoretical analysis. In this paper we show by simulation with real workload data that the algorithm is also suitable for use on real parallel computers. This holds for several scheduling criteria, such as makespan and the sum of the flow times. Simulation is also used to determine the best parameter set for the new method.
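To make the two criteria named above concrete, the following is a minimal sketch of a plain FCFS baseline for rigid parallel jobs. It is not the gang-scheduling method of the paper. The job tuple layout `(submit, nodes, runtime)` and the function names `fcfs_completion_times`, `makespan`, and `total_flow_time` are assumptions made for illustration only.

```python
import heapq


def fcfs_completion_times(jobs, total_nodes):
    """Simulate strict FCFS without preemption or backfilling.

    jobs: list of (submit, nodes, runtime) tuples in arrival order
          (hypothetical layout, chosen for this sketch).
    Returns the completion time of each job, in the same order.
    """
    running = []          # min-heap of (finish_time, nodes) for running jobs
    free = total_nodes    # currently idle nodes
    prev_start = 0        # strict FCFS: a job never starts before its predecessor
    completions = []
    for submit, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError("job requests more nodes than the machine has")
        t = max(prev_start, submit)
        # Release every job that has finished by time t.
        while running and running[0][0] <= t:
            free += heapq.heappop(running)[1]
        # Wait for further jobs to finish until the head job fits.
        while free < nodes:
            finish, released = heapq.heappop(running)
            t = max(t, finish)
            free += released
        free -= nodes
        heapq.heappush(running, (t + runtime, nodes))
        completions.append(t + runtime)
        prev_start = t
    return completions


def makespan(completions):
    """Completion time of the last job to finish."""
    return max(completions)


def total_flow_time(jobs, completions):
    """Sum over all jobs of (completion time - submission time)."""
    return sum(c - job[0] for job, c in zip(jobs, completions))


# A wide job (4 nodes) arriving behind a narrow one leaves nodes idle,
# which is the kind of fragmentation plain FCFS suffers from.
example_jobs = [(0, 2, 4), (1, 4, 2), (2, 1, 1)]
example_completions = fcfs_completion_times(example_jobs, total_nodes=4)
```

In this example the four-node job cannot start until the first job releases its nodes, so two nodes sit idle in the meantime and the one-node job queued behind it is delayed as well.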
