Energy measurement and prediction for multi-threaded programs

For many years, runtime performance has been the main concern in high-performance and parallel computing. However, power and energy consumption have become increasingly important, especially for large-scale supercomputer systems. Accordingly, energy-saving techniques such as dynamic voltage and frequency scaling (DVFS) have been integrated into recent processor architectures. In this article, we investigate the power and energy consumption of multi-threaded programs executed on recent Intel multicore processors, using the PARSEC and SPLASH benchmarks as examples. In particular, we measure power and energy consumption experimentally with two different methods. Based on the measured energy values, we evaluate power and energy models suggested in the literature as well as a new heuristic model, and compare the models with respect to their prediction capabilities. We show that the heuristic model in particular predicts the power and energy consumption quite accurately as a function of the scaled frequency.
