Paper information - Balancing job performance with system performance via locality-aware scheduling on torus-connected systems

Balancing job performance with system performance via locality-aware scheduling on torus-connected systems

Torus-connected networks are widely used in modern supercomputers because of their linear per-node cost scaling and competitive overall performance. The job scheduling system plays a critical role in the efficient use of supercomputers. As supercomputers continue to grow in size, a fundamental problem arises: how to effectively balance job performance with system performance on torus-connected machines? In this work, we present a new scheduling design named window-based locality-aware scheduling. Our design contains three novel features. First, rather than scheduling jobs one by one, our design takes a “window” of jobs, i.e., multiple jobs, into consideration for job prioritizing and resource allocation. Second, our design maintains a list of slots to preserve node contiguity information for resource allocation. Finally, we formulate our scheduling decision making as a 0-1 Multiple Knapsack Problem and present two algorithms to solve it. A series of trace-based simulations using job logs collected from production supercomputers indicates that this new scheduling design has real potential and can effectively balance job performance and system performance.
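To make the knapsack view concrete, here is a minimal sketch of how a window of jobs might be packed into a list of contiguous slots. It is not one of the two algorithms the paper presents, which the abstract does not describe. It is a generic greedy heuristic for a 0-1 multiple knapsack instance, and the function name `greedy_window_assign`, the per-job priority value, and the best-fit rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical job record: (job_id, nodes_requested, priority_value).
Job = Tuple[str, int, float]


def greedy_window_assign(jobs: List[Job], slots: List[int]) -> Dict[str, int]:
    """Greedy heuristic for a 0-1 multiple knapsack instance.

    jobs  -- the scheduling window; each job is an item whose weight is its
             node count (must be > 0) and whose profit is an assumed priority.
    slots -- free contiguous node counts; each slot is one knapsack.
    Returns a mapping from job_id to the index of the slot it was placed in.
    """
    remaining = list(slots)
    assignment: Dict[str, int] = {}
    # Consider jobs by value density (priority per node), highest first.
    order = sorted(jobs, key=lambda j: j[2] / j[1], reverse=True)
    for job_id, size, _value in order:
        # Best fit: among slots that can hold the job, pick the tightest one.
        candidates = [i for i, cap in enumerate(remaining) if cap >= size]
        if not candidates:
            continue  # the job stays queued for a later scheduling pass
        best = min(candidates, key=lambda i: remaining[i] - size)
        remaining[best] -= size
        assignment[job_id] = best
    return assignment


if __name__ == "__main__":
    window = [("A", 512, 3.0), ("B", 1024, 4.0), ("C", 512, 1.0), ("D", 2048, 5.0)]
    free_slots = [1024, 2048]
    print(greedy_window_assign(window, free_slots))
```

Each job is placed in at most one slot, and no slot is filled beyond its capacity, which are the two constraints of the 0-1 multiple knapsack formulation. A greedy pass like this is fast but gives no optimality guarantee.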

[1] Dror G. Feitelson, et al. Utilization and Predictability in Scheduling the IBM SP2 with Backfilling, 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.

[2] Phillip Krueger, et al. Job Scheduling is More Important than Processor Allocation for Hypercube Computers, 1994, IEEE Trans. Parallel Distributed Syst.

[3] IBM Redbooks. IBM System Blue Gene Solution: Blue Gene/Q System Administration, 2012.

[4] Zhiling Lan, et al. Analyzing and adjusting user runtime estimates to improve job scheduling on the Blue Gene/P, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[5] Katherine E. Isaacs, et al. There goes the neighborhood: Performance degradation due to nearby jobs, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[6] David S. Johnson, et al. Computers and Intractability: A Guide to the Theory of NP-Completeness, 1978.

[7] Esther M. Arkin, et al. Processor allocation on Cplant: achieving general processor locality using one-dimensional allocation strategies, 2002.

[8] José E. Moreira, et al. Resource allocation and utilization in the Blue Gene/L supercomputer, 2005, IBM J. Res. Dev.

[9] Bill Nitzberg, et al. Noncontiguous Processor Allocation Algorithms for Mesh-Connected Multicomputers, 1997, IEEE Trans. Parallel Distributed Syst.

[10] Esther M. Arkin, et al. Processor allocation on Cplant: achieving general processor locality using one-dimensional allocation strategies, 2002, Proceedings. IEEE International Conference on Cluster Computing.

[11] Paolo Toth, et al. Heuristic algorithms for the multiple knapsack problem, 1981, Computing.

[12] Zhiling Lan, et al. Job scheduling with adjusted runtime estimates on production supercomputers, 2013, J. Parallel Distributed Comput.

[13] Laxmikant V. Kalé, et al. Application-specific topology-aware mapping for three dimensional topologies, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[14] Zhiling Lan, et al. Reducing Energy Costs for IBM Blue Gene/P via Power-Aware Job Scheduling, 2013, JSSPP.

[15] David S. Johnson, et al. Near-optimal bin packing algorithms, 1973.

[16] Javier Navaridas, et al. Effects of Topology-Aware Allocation Policies on Scheduling Performance, 2009, JSSPP.

[17] Zhiling Lan, et al. Reducing Fragmentation on Torus-Connected Supercomputers, 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[18] S. Martello, et al. Heuristische Algorithmen zur Packung von mehreren Rucksäcken, 1981.

[19] David S. Johnson, et al. Fast Algorithms for Bin Packing, 1974, J. Comput. Syst. Sci.

[20] Xu Yang, et al. Integrating dynamic pricing of electricity into energy aware scheduling for HPC systems, 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).

[22] Uwe Schwiegelshohn, et al. Parallel Job Scheduling - A Status Report, 2004, JSSPP.