Slow Down or Halt: Saving the Optimal Energy for Scalable HPC Systems

The presence of pervasive slack provides ample opportunities for achieving energy efficiency for HPC systems nowadays. Regardless of communication slack, classic energy saving approaches for saving energy during the slack otherwise include race-to-halt and CP-aware slack reclamation, which reply on power scaling techniques to adjust processor power states judiciously during the slack. Existing efforts demonstrate CP-aware slack reclamation is superior to race-to-halt in energy saving capability. In this paper, we formally model our observation that the energy saving capability gap between the two approaches is significantly narrowed down on today's processors, given that state-of-the-art CMOS technologies allow insignificant variation of supply voltage as operating frequency of a processor scales. Experimental results on a large-scale power-aware cluster validate our findings.

[1]  Martin Schulz,et al.  Bounding energy consumption in large-scale MPI programs , 2007, Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC '07).

[2]  David K. Lowenthal,et al.  Using multiple energy gears in MPI programs on a power-scalable cluster , 2005, PPoPP.

[3]  Hiroto Yasuura,et al.  Voltage scheduling problem for dynamically variable voltage processors , 1998, Proceedings. 1998 International Symposium on Low Power Electronics and Design (IEEE Cat. No.98TH8379).

[4]  Dong Li,et al.  HP-DAEMON: High Performance Distributed Adaptive Energy-efficient Matrix-multiplicatiON , 2014, ICCS.

[5]  Y. Taur,et al.  A continuous, analytic drain-current model for DG MOSFETs , 2004 .

[6]  Hui Liu,et al.  High performance linpack benchmark: a fault tolerant implementation without checkpointing , 2011, ICS '11.

[7]  Sriram Krishnamoorthy,et al.  Scalable work stealing , 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[8]  Rong Ge,et al.  CPU MISER: A Performance-Directed, Run-Time System for Power-Aware Clusters , 2007, 2007 International Conference on Parallel Processing (ICPP 2007).

[9]  Bronis R. de Supinski,et al.  Adagio: making DVS practical for complex HPC applications , 2009, ICS.

[10]  Wei Wang,et al.  A continuous, analytic drain-current model for DG MOSFETs , 2004, IEEE Electron Device Letters.

[11]  Albert Y. Zomaya,et al.  Some observations on optimal frequency selection in DVFS-based energy consumption minimization , 2011, J. Parallel Distributed Comput..