Power-Aware Parallel Job Scheduling

The recent increase in the performance of High Performance Computing (HPC) centers has been accompanied by an even greater increase in power consumption. The power draw of modern supercomputers is not only an economic problem; it also harms the environment. Roughly speaking, the CPU accounts for 50% of total system power. Dynamic Voltage and Frequency Scaling (DVFS) is a widely used technique for managing CPU power. The parallel job scheduler is a good place for power management because it is aware of the whole system: current load, running jobs, waiting jobs and their wait times. This talk presents two power-aware parallel job scheduling policies that trade performance for energy while trying to minimize the performance penalty. The first policy assigns each job a frequency based on its predicted performance, while the second uses system utilization to decide when to run jobs at reduced frequency. Finally, the talk describes a power budgeting policy, since power budgeting has become very important for reasons such as existing infrastructure limitations, reliability and carbon footprint. Interestingly, under a given power budget, the DVFS technique can even improve overall job performance.
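
As a rough illustration of the second policy, the sketch below picks a frequency for a job that is about to start, based on current system utilization. It is a minimal sketch under stated assumptions, not the policy from the talk: the two frequency values, the utilization threshold, the direction of the rule (reduced frequency when the system is lightly loaded) and the names SystemState and choose_frequency are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration; not taken from the talk.
NOMINAL_GHZ = 2.3
REDUCED_GHZ = 1.8
UTILIZATION_THRESHOLD = 0.8


@dataclass
class SystemState:
    """Snapshot of the machine as seen by the scheduler (hypothetical type)."""
    total_cpus: int
    busy_cpus: int

    @property
    def utilization(self) -> float:
        # Fraction of processors currently allocated to running jobs.
        return self.busy_cpus / self.total_cpus


def choose_frequency(state: SystemState,
                     threshold: float = UTILIZATION_THRESHOLD) -> float:
    """Return the CPU frequency (GHz) for a job about to start.

    Assumed rule: when utilization is below the threshold, slowing a job
    down is less likely to delay waiting jobs, so the reduced frequency is
    used; otherwise the job runs at the nominal frequency.
    """
    if state.utilization < threshold:
        return REDUCED_GHZ
    return NOMINAL_GHZ


if __name__ == "__main__":
    print(choose_frequency(SystemState(total_cpus=100, busy_cpus=50)))
    print(choose_frequency(SystemState(total_cpus=100, busy_cpus=90)))
```

A real scheduler would also have to account for the longer run time of a slowed job when reserving processors; that part is left out here.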
