Computation-Aware Dynamic Frequency Scaling: Parsimonious Evaluation of the Time-Energy Trade-Off Using Design of Experiments

A promising approach to improving the energy efficiency of HPC applications is to apply energy-saving techniques to individual code regions according to their characteristics (e.g., blocking communication, load imbalance). Since most applications have many parallel code regions, this strategy requires extensive experimental time to find all the time-energy trade-offs for a given application. In this paper, we use Design of Experiments (DoE) to (1) reduce the experimental time through a parsimonious evaluation of execution time and energy; and (2) define the Pareto front containing all interesting time-energy trade-offs. We report the use of our methodology on seven benchmarks, each of which yields an interesting Pareto front with a distinct shape. For example, for the MiniFE benchmark with 25 parallel regions, we detect configurations that reduce energy by 9.27% with a non-significant runtime penalty compared with using the high frequency for all regions; and, for the Graph500 benchmark with 17 parallel regions, configurations that reduce execution time by 7.0% with a 2.4% increase in energy consumption compared with running all regions at the lowest frequency.
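To make the notion of a time-energy Pareto front concrete, the following minimal Python sketch keeps only the configurations that no other configuration beats on both execution time and energy. It is an illustration under stated assumptions, not the procedure used in the paper: the function name `pareto_front`, the configuration labels, and all measurement values are hypothetical and were chosen only for the example.

```python
from typing import List, Tuple

# (label, execution time in seconds, energy in joules)
Config = Tuple[str, float, float]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Return the configurations not dominated in (time, energy).

    Both objectives are minimized. A configuration is dropped when
    another one is at least as good on both objectives.
    """
    # Sort by time, breaking ties by energy, so that any configuration
    # that could dominate a given one appears before it.
    ordered = sorted(configs, key=lambda c: (c[1], c[2]))
    front: List[Config] = []
    best_energy = float("inf")
    for cfg in ordered:
        # Keep a configuration only if it uses strictly less energy
        # than every faster (or equally fast) one seen so far.
        if cfg[2] < best_energy:
            front.append(cfg)
            best_energy = cfg[2]
    return front


# Hypothetical measurements, invented for illustration only.
measurements: List[Config] = [
    ("all-high", 100.0, 5000.0),  # every region at the high frequency
    ("all-low", 120.0, 4400.0),   # every region at the lowest frequency
    ("mixed-a", 101.0, 4600.0),   # some regions at a lower frequency
    ("mixed-b", 115.0, 4700.0),   # slower and costlier than mixed-a
]

for label, time_s, energy_j in pareto_front(measurements):
    print(f"{label}: {time_s:.1f} s, {energy_j:.1f} J")
```

In this made-up data set, `mixed-b` is excluded because `mixed-a` is both faster and cheaper in energy, while the remaining three configurations each represent a different trade-off between the two objectives.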
