Using Queue Time Predictions for Processor Allocation

When a malleable job is submitted to a space-sharing parallel computer, it must choose often whether to begin execution on a small, available cluster, or wait in queue for more processors to become available. To make this decision, it must predict how long it will have to wait for the larger clsuter. We propose statistical techniques for predicting these queue times, and develop an allocation strategy that uses these predictions. We present a workload model based on the environment we have observed at the San Diego Supercomputer Center, and use this model to drive simulations of various allocation strategies. We conclude that prediction-based allocation not only improves the average turnaround time for the jobs; it also improves the utilization of the system as a whole.

[1]  Larry Rudolph,et al.  Towards Convergence in Job Schedulers for Parallel Supercomputers , 1996, JSSPP.

[2]  Alexander Reinefeld,et al.  MARS - A framework for minimizing the job execution time in a metacomputing environment , 1996, Future Gener. Comput. Syst..

[3]  Satish K. Tripathi,et al.  The Processor Working Set and Its Use in Scheduling Multiprocessor Systems , 1991, IEEE Trans. Software Eng..

[4]  Satish K. Tripathi,et al.  An analysis of several processor partitioning policies for parallel computers , 1991 .

[5]  Mark S. Squillante,et al.  Performance analysis of job scheduling policies in parallel supercomputing environments , 1993, Supercomputing '93. Proceedings.

[6]  Giuseppe Serazzi,et al.  Analysis of Non-Work-Conserving Processor Partitioning Policies , 1995, JSSPP.

[7]  Satish K. Tripathi,et al.  A Comparative Analysis of Static Processor Partitioning Policies for Parallel Computers , 1993, MASCOTS.

[8]  Dror G. Feitelson,et al.  Job Scheduling in Multiprogrammed Parallel Systems , 1997 .

[9]  Dror G. Feitelson,et al.  Job Characteristics of a Production Parallel Scientivic Workload on the NASA Ames iPSC/860 , 1995, JSSPP.

[10]  Bill Nitzberg,et al.  A comparison of workload traces from two production parallel machines , 1996, Proceedings of 6th Symposium on the Frontiers of Massively Parallel Computation (Frontiers '96).

[11]  Francine Berman,et al.  Scheduling from the perspective of the application , 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[12]  Mary K. Vernon,et al.  Use of application characteristics and limited preemption for run-to-completion parallel processor scheduling policies , 1994, SIGMETRICS.

[13]  Allen B. Downey,et al.  A parallel workload model and its implications for processor allocation , 1996, Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183).

[14]  Dan C. Marinescu,et al.  Models and Algorithms for Coscheduling Compute-Intensive Tasks on a Network of Workstations , 1992, J. Parallel Distributed Comput..

[15]  Larry Rudolph,et al.  Evaluation of Design Choices for Gang Scheduling Using Distributed Hierarchical Control , 1996, J. Parallel Distributed Comput..

[16]  Honbo Zhou,et al.  The EASY - LoadLeveler API Project , 1996, JSSPP.

[17]  Shikharesh Majumdar,et al.  Scheduling in multiprogrammed parallel systems , 1988, SIGMETRICS 1988.

[18]  Kenneth C. Sevcik,et al.  Coordinated allocation of memory and processors in multiprocessors , 1996, SIGMETRICS '96.

[19]  Giuseppe Serazzi,et al.  Evaluation of Multiprocessor Allocation Policies , 1993 .

[20]  Giuseppe Serazzi,et al.  Robust Partitioning Policies of Multiprocessor Systems , 1994, Perform. Evaluation.

[21]  Allen B. Downey Predicting queue times on space-sharing parallel computers , 1997, Proceedings 11th International Parallel Processing Symposium.

[22]  Allen B. Downey,et al.  A Model For Speedup of Parallel Programs , 1997 .

[23]  Edward D. Lazowska,et al.  Speedup Versus Efficiency in Parallel Systems , 1989, IEEE Trans. Computers.

[24]  Kenneth C. Sevcik Characterizations of parallelism in applications and their use in scheduling , 1989, SIGMETRICS '89.