The maximal utilization of processor co-allocation in multicluster systems

In systems consisting of multiple clusters of processors that employ space sharing for scheduling jobs, such as our Distributed ASCI Supercomputer (DAS), co-allocation, i.e., the simultaneous allocation of processors in multiple clusters to single jobs, may be required. Studies of scheduling in single clusters have shown that the achievable (maximal) utilization may be much less than 100%, a problem that may be aggravated in multicluster systems. In this paper we study the maximal utilization when co-allocating jobs in multicluster systems, both analytically (we derive exact and approximate formulas for the case of exponentially distributed service times) and with simulations, using synthetic workloads as well as workloads derived from the logs of actual systems.
