Linear programming based parallel job scheduling for power constrained systems

Power has become the primary constraint in high performance computing. Traditionally, parallel job scheduling policies have been designed to improve certain job performance metrics when scheduling parallel workloads on a system with a given number of processors. The number of available processors is no longer the only limitation in parallel job scheduling. The recent increase in processor power consumption has introduced a new one: the available power. Constraints of this kind naturally lead to an optimization problem. In this paper we propose MaxJobPerf, a new parallel job scheduling policy based on integer linear programming. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique that runs applications at reduced CPU frequency/voltage, trading increased execution time for reduced power. The optimization problem determines which jobs should run and at which frequency. The MaxJobPerf policy clearly outperforms the other power budgeting approaches at the parallel job scheduling level.
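
To make the shape of such an optimization problem concrete, the following is a minimal sketch and not the MaxJobPerf formulation itself. It assumes a hypothetical gear table (`GEARS`), a hypothetical objective (processor-weighted relative speed of the jobs that run), and it replaces an integer linear programming solver with exhaustive search over a tiny instance. Each waiting job is either left in the queue or started at one gear, subject to a processor limit and a power budget.

```python
from itertools import product

# Hypothetical DVFS gears (illustrative values, not from the paper):
# (frequency in GHz, power per processor in W, relative speed).
GEARS = [(2.3, 100.0, 1.0), (1.8, 75.0, 0.85), (1.4, 60.0, 0.7)]


def schedule(jobs, total_cpus, power_budget):
    """Pick, for each waiting job, either None (wait) or a gear index.

    jobs: list of processor counts, one per waiting job.
    Returns (best_value, best_choice). Exhaustive search stands in for
    an ILP solver here, so this only scales to a handful of jobs.
    """
    options = [None] + list(range(len(GEARS)))
    best_value, best_choice = -1.0, None
    for choice in product(options, repeat=len(jobs)):
        running = [(n, g) for n, g in zip(jobs, choice) if g is not None]
        cpus = sum(n for n, _ in running)
        power = sum(n * GEARS[g][1] for n, g in running)
        # Constraint 1: processors. Constraint 2: power budget.
        if cpus > total_cpus or power > power_budget:
            continue
        # Assumed objective: processor-weighted relative speed.
        value = sum(n * GEARS[g][2] for n, g in running)
        if value > best_value:
            best_value, best_choice = value, choice
    return best_value, best_choice
```

With two 4-processor jobs on an 8-processor system, a budget of 800 W lets both jobs run at the top gear, while a budget of 600 W makes the search pick the middle gear for both rather than running only one job at full speed. This is the kind of trade-off the abstract describes: which jobs run, and at which frequency.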

[1]  Mateo Valero,et al.  Optimizing job performance under a given power constraint in HPC centers , 2010, International Conference on Green Computing.

[2]  Rajkumar Kettimuthu,et al.  Selective preemption strategies for parallel job scheduling , 2002, Proceedings International Conference on Parallel Processing.

[3]  Ron Brightwell,et al.  Characterizing application sensitivity to OS interference using kernel-level noise injection , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[4]  David K. Lowenthal,et al.  Using multiple energy gears in MPI programs on a power-scalable cluster , 2005, PPoPP.

[5]  David K. Lowenthal,et al.  Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[6]  Karl S. Hemmert Green HPC: From Nice to Necessity , 2010, Comput. Sci. Eng..

[7]  Dror G. Feitelson,et al.  Utilization, Predictability, Workloads, and User Runtime Estimates in Scheduling the IBM SP2 with Backfilling , 2001, IEEE Trans. Parallel Distributed Syst..

[8]  Dong Li,et al.  PowerPack: Energy Profiling and Analysis of High-Performance Systems and Applications , 2010, IEEE Transactions on Parallel and Distributed Systems.

[9]  Gurindar S. Sohi,et al.  A static power model for architects , 2000, MICRO 33.

[10]  Xiaorui Wang,et al.  Power capping: a prelude to power shifting , 2008, Cluster Computing.

[11]  J.C. Sancho,et al.  Quantifying the Potential Benefit of Overlapping Communication and Computation in Large-Scale Scientific Applications , 2006, ACM/IEEE SC 2006 Conference (SC'06).

[12]  Evgenia Smirni,et al.  Power-aware resource allocation in high-end systems via online simulation , 2005, ICS '05.

[13]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.

[14]  Rajkumar Buyya,et al.  Power Aware Scheduling of Bag-of-Tasks Applications with Deadline Constraints on DVS-enabled Clusters , 2007, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07).

[15]  Ulrich Kremer,et al.  The design, implementation, and evaluation of a compiler algorithm for CPU energy reduction , 2003, PLDI '03.

[16]  Douglas Thain,et al.  Scheduling Grid workloads on multicore clusters to minimize energy and maximize performance , 2009, 2009 10th IEEE/ACM International Conference on Grid Computing.

[17]  Hiroshi Nakashima,et al.  Saving 200kW and $200 K/year by power-aware job/machine scheduling , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[18]  D.K. Lowenthal,et al.  Adaptive, Transparent Frequency and Voltage Scaling of Communication Phases in MPI Programs , 2006, ACM/IEEE SC 2006 Conference (SC'06).

[19]  Mitsuhisa Sato,et al.  Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[20]  Tugba Taskaya-Temizel,et al.  A Hadoop solution for ballistic image analysis and recognition , 2011, 2011 International Conference on High Performance Computing & Simulation.

[21]  Julita Corbalán,et al.  A Job Self-scheduling Policy for HPC Infrastructures , 2007, JSSPP.

[22]  R. Pearl Biometrics , 1914, The American Naturalist.

[23]  SangMin Lee,et al.  Development of large-scale structural analysis system on a supercomputer , 2011, 2011 International Conference on High Performance Computing & Simulation.

[24]  Xiaorui Wang,et al.  Cluster-level feedback power control for performance optimization , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[25]  Wu-chun Feng,et al.  A Power-Aware Run-Time System for High-Performance Computing , 2005, ACM/IEEE SC 2005 Conference (SC'05).

[26]  Chuang Liu,et al.  Online resource matching for heterogeneous grid environments , 2005, CCGrid 2005. IEEE International Symposium on Cluster Computing and the Grid, 2005..

[27]  Margaret Martonosi,et al.  An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[28]  Martin Schulz,et al.  Bounding energy consumption in large-scale MPI programs , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).

[29]  Paul W. Rendell A Universal Turing Machine in Conway's Game of Life , 2011, 2011 International Conference on High Performance Computing & Simulation.

[30]  Georgios Ch. Sirakoulis,et al.  Depicting pathways for cooperative miniature robots using Cellular Automata , 2011, 2011 International Conference on High Performance Computing & Simulation.

[31]  Loay D. Khalaf UWB antenna and LNA receiver simultaneous matching , 2011, 2011 International Conference on High Performance Computing & Simulation.

[32]  Philip S. Yu,et al.  Distributed hoeffding trees for pocket data mining , 2011, 2011 International Conference on High Performance Computing & Simulation.

[33]  Feng Pan,et al.  Analyzing the Energy-Time Trade-Off in High-Performance Computing Applications , 2007, IEEE Transactions on Parallel and Distributed Systems.

[34]  Douglas G. Down,et al.  Power-Aware Linear Programming based Scheduling for heterogeneous computer clusters , 2010, International Conference on Green Computing.

[35]  Pierre-François Dutot,et al.  Bi-criteria algorithm for scheduling jobs on cluster platforms , 2004, SPAA '04.

[36]  Karthick Rajamani,et al.  A performance-conserving approach for reducing peak power consumption in server systems , 2005, ICS '05.