SprintCon: Controllable and Efficient Computational Sprinting for Data Center Servers

Computational sprinting is an effective mechanism for temporarily boosting the performance of data center servers. However, despite its large performance benefit, how to make the sprinting process controllable and how to maximize sprinting efficiency have not yet been well studied. These become significant problems for a data center when computational sprinting is needed for more than a few minutes, because sustained sprinting relies on energy storage, whose capacity is limited. Controlling sprinting and making it efficient involves not only how fast to run servers and how to allocate resources to co-running workloads, but also how sprinting contributes to power overload and how to handle that overload with circuit breakers and energy storage to ensure power safety. Different workloads affect sprinting in different ways, so efficient sprinting requires workload-specific strategies. In this paper, we propose SprintCon to realize controllable and efficient computational sprinting for data center servers. SprintCon consists mainly of a power load allocator and two power controllers. The allocator determines how to divide the power load among the different power sources. The server power controller adapts the CPU cores that process batch workloads to improve efficiency in terms of computing, energy, and cost. The UPS power controller dynamically adjusts the discharge rate of UPS energy storage to satisfy the time-varying power demand of interactive workloads and to ensure power safety. Experimental results show that, compared with state-of-the-art solutions, SprintCon achieves 6-56% better computing performance and reduces the demand for energy storage by up to 87%.
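To make the idea of dividing a power load between sources concrete, the following is a minimal sketch and not SprintCon's actual allocator, which the abstract does not specify. It assumes the simplest possible policy: the circuit breaker is kept at or below a fixed limit and UPS energy storage covers the excess, up to a maximum discharge rate. The names `allocate_power`, `breaker_limit_w`, and `ups_max_discharge_w` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    breaker_w: float  # power drawn through the circuit breaker (watts)
    ups_w: float      # power discharged from UPS energy storage (watts)
    unmet_w: float    # demand that neither source can cover (watts)


def allocate_power(demand_w: float,
                   breaker_limit_w: float,
                   ups_max_discharge_w: float) -> Allocation:
    """Split a power demand between the breaker and the UPS.

    Assumed policy (illustrative only): fill the breaker up to its limit
    first, then discharge the UPS for the remainder, and report whatever
    is still uncovered so a caller could throttle servers accordingly.
    """
    if min(demand_w, breaker_limit_w, ups_max_discharge_w) < 0:
        raise ValueError("power values must be non-negative")
    breaker_w = min(demand_w, breaker_limit_w)
    ups_w = min(demand_w - breaker_w, ups_max_discharge_w)
    unmet_w = demand_w - breaker_w - ups_w
    return Allocation(breaker_w, ups_w, unmet_w)


if __name__ == "__main__":
    # A sprint that exceeds both the breaker limit and the UPS discharge rate.
    print(allocate_power(1200.0, 1000.0, 150.0))
```

Any nonzero `unmet_w` in this sketch would correspond to demand that has to be removed by slowing servers down, which is the kind of decision the abstract assigns to the server power controller.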
