Selective Reservation Strategies for Backfill Job Scheduling

Although there is wide agreement that backfilling produces significant benefits in the scheduling of parallel jobs, there is no clear consensus on which backfilling strategy is preferable: conservative backfilling or the more aggressive EASY backfilling scheme. Using trace-based simulation, we show that if performance is viewed within job categories based on width (processor request size) and length (job duration), some consistent trends may be observed. Using insights gleaned from this characterization, we develop a selective reservation strategy for backfill scheduling. We demonstrate that the new scheme is better than both conservative and aggressive backfilling. We also consider the issue of fairness in job scheduling and develop a new quantitative approach to its characterization. We show that the newly proposed schemes are also comparable to or better than aggressive backfilling with respect to the fairness criterion.
