Characterization of Bandwidth-Aware Meta-Schedulers for Co-Allocating Jobs Across Multiple Clusters

In this paper, we present a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We compare this dynamic model with previous research that applies a fixed execution-time penalty to co-allocated jobs, and we explore the contention that simultaneously co-allocated jobs often create in the network infrastructure of a dedicated computational multi-cluster. We also present several bandwidth-aware co-allocating meta-schedulers, which take inter-cluster network utilization into account to mitigate degraded job run-time performance. Because the model captures the time-varying utilization of shared inter-cluster network resources, we are able to evaluate the performance of multi-cluster scheduling algorithms that focus not only on node resource allocation, but also on shared inter-cluster network bandwidth.
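The contrast between a fixed execution-time penalty and a bandwidth-dependent slowdown can be illustrated with a minimal sketch. This is not the model from the paper: the `Job` fields, the penalty factor of 1.25, the proportional-sharing rule, and the function names are all assumptions made for illustration, and the sketch evaluates a single static snapshot rather than time-varying utilization.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description; all fields are illustrative assumptions."""
    name: str
    base_runtime: float      # seconds when run inside a single cluster
    comm_fraction: float     # share of base_runtime spent communicating (0..1)
    bandwidth_demand: float  # inter-cluster bandwidth the job wants, in Mbit/s


def fixed_penalty_runtime(job, penalty=1.25):
    """Fixed-penalty view: every co-allocated job is slowed by the same factor."""
    return job.base_runtime * penalty


def bandwidth_aware_runtime(job, co_allocated, link_capacity):
    """Bandwidth-dependent view (assumed rule): the communication phase
    stretches in proportion to how oversubscribed the shared link is.

    `co_allocated` is the list of jobs sharing the link, including `job`.
    """
    total_demand = sum(j.bandwidth_demand for j in co_allocated)
    # A link that is not saturated causes no slowdown (factor 1.0).
    saturation = max(1.0, total_demand / link_capacity)
    compute_time = job.base_runtime * (1.0 - job.comm_fraction)
    comm_time = job.base_runtime * job.comm_fraction * saturation
    return compute_time + comm_time


if __name__ == "__main__":
    a = Job("a", base_runtime=100.0, comm_fraction=0.4, bandwidth_demand=600.0)
    b = Job("b", base_runtime=200.0, comm_fraction=0.1, bandwidth_demand=600.0)
    print(fixed_penalty_runtime(a))                      # same value regardless of load
    print(bandwidth_aware_runtime(a, [a], 1000.0))       # link not saturated
    print(bandwidth_aware_runtime(a, [a, b], 1000.0))    # link oversubscribed
```

Under these assumptions the fixed-penalty estimate never changes with load, whereas the bandwidth-dependent estimate grows only when the combined demand of the co-allocated jobs exceeds the link capacity, which is the kind of interaction a bandwidth-aware meta-scheduler would try to avoid.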
