Gang scheduling and adaptive resource allocation to mitigate advance reservation impact

Simultaneous parallel computational grid jobs require reservations by the local job schedulers to ensure that matching time slots are allocated at the different sites involved. However, reservations create roadblocks in the local schedule, so only a small percentage of reservations can be tolerated; a large number of reservations typically has adverse effects on local response times and machine utilization. We have extended our SCOJO scheduler to support advance reservations. SCOJO can perform space sharing or gang scheduling and can run in either an adaptive or a traditional non-adaptive variant. We show that gang scheduling is more flexible than space sharing with regard to tolerating reservations. We also show that, for space sharing and a low multiprogramming level, the adaptive variants tolerate reservations better than the non-adaptive variants.
