PoweRock: Power Modeling and Flexible Dynamic Power Management for Many-Core Architectures

Energy-efficient design for green high-performance computing (HPC) is a formidable challenge for today’s computer scientists, and dynamic power management (DPM) is a key enabling technology. DPM can serve different goals in different application scenarios (e.g., enforcing an upper bound on power consumption, or optimizing energy usage under a performance-loss constraint), yet existing DPM solutions are generally designed to meet only one goal and cannot adapt when the optimization objective changes. This paper proposes a novel, flexible DPM approach based on a profile-guided dynamic voltage/frequency scaling (DVFS) scheme that can meet these different goals. Our contributions include 1) an accurate power prediction model for many-core architectures; 2) a new profiling method for (distributed) shared-memory parallel applications that flexibly determines the optimal frequency and voltage for different phases of the execution; and 3) a hierarchical, domain-aware power control design that improves the scalability of the DPM system on many-core chips. We implement the approach as a working library dubbed PoweRock and evaluate it on the Intel SCC port of the Barrelfish operating system. Experimental results from several well-known benchmarks show that PoweRock attains significant improvements in energy and energy-delay product (EDP) over a static power scheme (average: 37.1% and 25.5%; best case: 64.0% and 65.4%, respectively).
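
To make the idea of one profile serving several DPM goals concrete, here is a minimal Python sketch. It is not PoweRock's algorithm or API, which the abstract does not describe. The `Sample` type, the three `pick_*` functions, and the numbers in `PROFILE` are all hypothetical. The sketch assumes only that profiling yields, for one execution phase, a run time and an average power at each candidate frequency. It uses the standard definitions energy = power × time and EDP = energy × time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One profiled operating point for a single execution phase (hypothetical)."""
    freq_mhz: int
    time_s: float
    power_w: float

    @property
    def energy_j(self) -> float:
        # Energy is average power multiplied by run time.
        return self.power_w * self.time_s

    @property
    def edp(self) -> float:
        # Energy-delay product: energy multiplied by run time.
        return self.energy_j * self.time_s


def pick_under_power_cap(samples, cap_w):
    """Fastest operating point whose power stays within the cap.

    Falls back to the lowest-power point if none satisfies the cap.
    """
    feasible = [s for s in samples if s.power_w <= cap_w]
    if not feasible:
        return min(samples, key=lambda s: s.power_w)
    return min(feasible, key=lambda s: s.time_s)


def pick_min_energy(samples, max_slowdown):
    """Lowest-energy point within a relative slowdown of the fastest point."""
    fastest = min(s.time_s for s in samples)
    feasible = [s for s in samples if s.time_s <= fastest * (1.0 + max_slowdown)]
    return min(feasible, key=lambda s: s.energy_j)


def pick_min_edp(samples):
    """Operating point with the smallest energy-delay product."""
    return min(samples, key=lambda s: s.edp)


# Made-up profile of one phase at three frequency levels (illustrative only).
PROFILE = [
    Sample(freq_mhz=800, time_s=1.0, power_w=50.0),
    Sample(freq_mhz=533, time_s=1.2, power_w=35.0),
    Sample(freq_mhz=400, time_s=1.5, power_w=25.0),
]

if __name__ == "__main__":
    print("power cap 40 W   ->", pick_under_power_cap(PROFILE, 40.0).freq_mhz, "MHz")
    print("25% slowdown max ->", pick_min_energy(PROFILE, 0.25).freq_mhz, "MHz")
    print("minimum EDP      ->", pick_min_edp(PROFILE).freq_mhz, "MHz")
```

In this sketch the same profile data answers each objective with a different selection function, so changing the goal requires no new profiling run. A real system would apply such a choice per phase and per voltage/frequency domain, which this example does not model.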
