Adaptive Voltage/Frequency Scaling and Core Allocation for Balanced Energy and Performance on Multicore CPUs

Energy efficiency is a known major concern for computing system designers. Significant effort is devoted to power optimization of modern systems, especially in largescale installations such as data centers, in which both high performance and energy efficiency are important. Power optimization can be achieved through different approaches, several of which focus on adaptive voltage regulation. In this paper, we present a comprehensive exploration of how two server-grade systems behave in different frequency and core allocation configurations beyond nominal voltage operation. Our analysis, which is built on top of two state-of-the-art ARMv8 microprocessor chips (Applied Micro’s X-Gene 2 and X-Gene 3) aims (1) to identify the best performance per watt operation points when the servers are operating in various voltage/frequency combinations, (2) to reveal how and why the different core allocation options on the available cores of the microprocessor affect the energy consumption, and (3) to enhance the default Linux scheduler to take task allocation decisions for balanced performance and energy efficiency. Our findings, on actual servers’ hardware, have been integrated into a lightweight online monitoring daemon which decides the optimal combination of voltage, core allocation, and clock frequency to achieve higher energy efficiency. Our approach reduces on average the energy by 25.2% on X-Gene 2, and 22.3% on X-Gene 3, with a minimal performance penalty of 3.2% on X-Gene 2 and 2.5% on X-Gene 3, compared to the default system configuration. Keywords-Energy efficiency; voltage and frequency scaling; power consumption; multicore characterization; micro-servers;

[1]  Dimitris Gizopoulos,et al.  Voltage margins identification on commercial x86-64 multicore microprocessors , 2017, 2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS).

[2]  Pradip Bose,et al.  Crank it up or dial it down: Coordinated multiprocessor frequency and folding control , 2013, 2013 46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[3]  T. N. Vijaykumar,et al.  Pipeline muffling and a priori current ramping: architectural techniques to reduce high-frequency inductive noise , 2003, ISLPED '03.

[4]  Radu Teodorescu,et al.  Dynamic reduction of voltage margins by leveraging on-chip ECC in Itanium II processors , 2013, ISCA.

[5]  Eduard Ayguadé,et al.  CATA: Criticality Aware Task Acceleration for Multicore Processors , 2016, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS).

[6]  Shidhartha Das,et al.  Analysis of adaptive clocking technique for resonant supply voltage noise mitigation , 2015, 2015 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).

[7]  Shidhartha Das,et al.  Power Integrity Analysis of a 28 nm Dual-Core ARM Cortex-A57 Cluster Using an All-Digital Power Delivery Monitor , 2017, IEEE Journal of Solid-State Circuits.

[8]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[9]  Hiroshi Sasaki,et al.  Characterization and mitigation of power contention across multiprogrammed workloads , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).

[10]  Meeta Sharma Gupta,et al.  An event-guided approach to reducing voltage noise in processors , 2009, 2009 Design, Automation & Test in Europe Conference & Exhibition.

[11]  Xiang Pan,et al.  VRSync: Characterizing and eliminating synchronization-induced voltage emergencies in many-core processors , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).

[12]  Shubhendu S. Mukherjee,et al.  APast Future Time Quantized AVF : A Means of Capturing Vulnerability Variations over Small Windows of Time , 2009 .

[13]  Michael D. Smith,et al.  Voltage Smoothing: Characterizing and Mitigating Voltage Noise in Production Processors via Software-Guided Thread Scheduling , 2010, 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture.

[14]  Manish Gupta,et al.  Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors , 2000, IEEE Micro.

[15]  Matthew Curtis-Maury,et al.  Improving the Efficiency of Parallel Applications on Multithreaded and Multicore Systems , 2008 .

[16]  Meeta Sharma Gupta,et al.  DeCoR: A Delayed Commit and Rollback mechanism for handling inductive noise in processors , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[17]  Lizy Kurian John,et al.  Automated di/dt stressmark generation for microprocessor power delivery networks , 2011, IEEE/ACM International Symposium on Low Power Electronics and Design.

[18]  Meeta Sharma Gupta,et al.  Towards a software approach to mitigate voltage emergencies , 2007, Proceedings of the 2007 international symposium on Low power electronics and design (ISLPED '07).

[19]  Jingwen Leng,et al.  Adaptive guardband scheduling to improve system-level efficiency of the POWER7+ , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[20]  David M. Brooks,et al.  Accurate and efficient regression modeling for microarchitectural performance and power prediction , 2006, ASPLOS XII.

[21]  George Ho,et al.  PAPI: A Portable Interface to Hardware Performance Counters , 1999 .

[22]  Kai Li,et al.  The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).

[23]  Josep Torrellas,et al.  Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors , 2008, 2008 International Symposium on Computer Architecture.

[24]  Shidhartha Das,et al.  14.6 An all-digital power-delivery monitor for analysis of a 28nm dual-core ARM Cortex-A57 cluster , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.

[25]  Lizy Kurian John,et al.  AUDIT: Stress Testing the Automatic Way , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[26]  Pradip Bose,et al.  Safe limits on voltage reduction efficiency in GPUs: A direct measurement approach , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[27]  Radu Teodorescu,et al.  EmerGPU: Understanding and mitigating resonance-induced voltage noise in GPU architectures , 2016, 2016 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[28]  Performance Arms X-Gene 3 for Cloud , .

[29]  Wolf-Dietrich Weber,et al.  Power provisioning for a warehouse-sized computer , 2007, ISCA '07.

[30]  Margaret Martonosi,et al.  Stargazer: Automated regression-based GPU design space exploration , 2012, 2012 IEEE International Symposium on Performance Analysis of Systems & Software.

[31]  Bishop Brock,et al.  Active management of timing guardband to save energy in POWER7 , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[32]  Shidhartha Das,et al.  Measuring and Exploiting Guardbands of Server-Grade ARMv8 CPU Cores and DRAMs , 2018, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W).

[33]  Jian Li,et al.  Dynamic power-performance adaptation of parallel computation on chip multiprocessors , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[34]  Kapil Vaswani,et al.  Construction and use of linear regression models for processor performance analysis , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[35]  Jingwen Leng,et al.  GPU voltage noise: Characterization and hierarchical smoothing of spatial and temporal voltage noise interference in GPU architectures , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[36]  Rahul Khanna,et al.  RAPL: Memory power estimation and capping , 2010, 2010 ACM/IEEE International Symposium on Low-Power Electronics and Design (ISLPED).

[37]  Shidhartha Das,et al.  An energy-efficient and error-resilient server ecosystem exceeding conservative scaling limits , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[38]  Gernot Heiser,et al.  Dynamic voltage and frequency scaling: the laws of diminishing returns , 2010 .

[39]  Li Zhou,et al.  Core tunneling: Variation-aware voltage noise mitigation in GPUs , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[40]  Dimitris Gizopoulos,et al.  Statistical Analysis of Multicore CPUs Operation in Scaled Voltage Conditions , 2018, IEEE Computer Architecture Letters.

[41]  Vivien Quéma,et al.  Thread and Memory Placement on NUMA Systems: Asymmetry Matters , 2015, USENIX Annual Technical Conference.

[42]  Samuel Naffziger,et al.  Adaptive Voltage Frequency Scaling Using Critical Path Accumulator Implemented in 28nm CPU , 2016, 2016 29th International Conference on VLSI Design and 2016 15th International Conference on Embedded Systems (VLSID).

[43]  Shidhartha Das,et al.  Modeling and characterization of the system-level Power Delivery Network for a dual-core ARM Cortex-A57 cluster in 28nm CMOS , 2015, 2015 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).

[44]  Meeta Sharma Gupta,et al.  Voltage emergency prediction: Using signatures to reduce operating margins , 2009, 2009 IEEE 15th International Symposium on High Performance Computer Architecture.

[45]  Margaret Martonosi,et al.  An Analysis of Efficient Multi-Core Global Power Management Policies: Maximizing Performance for a Given Power Budget , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[46]  Pradip Bose,et al.  Voltage Noise in Multi-Core Processors: Empirical Characterization and Optimization Opportunities , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[47]  Margaret Martonosi,et al.  A dynamic compilation framework for controlling microprocessor energy and performance , 2005, 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'05).

[48]  Shidhartha Das,et al.  Harnessing Voltage Margins for Energy Efficiency in Multicore CPUs , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[49]  Margaret Martonosi,et al.  Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management , 2006, 2006 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06).

[50]  Eli Chiprout,et al.  A microarchitecture-based framework for pre- and post-silicon power delivery analysis , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[51]  Radu Teodorescu,et al.  Authenticache: Harnessing cache ECC for system authentication , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[52]  Margaret Martonosi,et al.  Control techniques to eliminate voltage emergencies in high performance processors , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[53]  Dimitris Gizopoulos,et al.  Micro-Viruses for Fast System-Level Voltage Margins Characterization in Multicore CPUs , 2018, 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[54]  Radu Teodorescu,et al.  Using ECC Feedback to Guide Voltage Speculation in Low-Voltage Processors , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[55]  Yale N. Patt,et al.  Predicting Performance Impact of DVFS for Realistic Memory Systems , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.