Leveraging stored energy for handling power emergencies in aggressively provisioned datacenters

Datacenters spend $10-25 per watt to provision their power infrastructure, regardless of the watts actually consumed. Since peak power needs arise rarely, provisioning power infrastructure for them can be expensive. One can therefore aggressively under-provision the infrastructure, assuming that simultaneous peak draw across all equipment will happen rarely. The resulting non-zero probability of emergency events in which power needs exceed provisioned capacity, however small, mandates graceful reaction mechanisms that cap the power draw instead of leaving it to disruptive circuit breakers and fuses. Existing strategies for power capping use temporal knobs local to a server that throttle the rate of execution (using power modes), and/or spatial knobs that redirect or migrate excess load to regions of the datacenter with more power headroom. We show that these mechanisms degrade performance, and propose an entirely orthogonal solution that leverages existing UPS batteries to temporarily augment the utility supply during emergencies. We build an experimental prototype to demonstrate such power capping on a cluster of 8 servers, each with an individual battery, and implement several online heuristics in the context of different datacenter workloads to evaluate their effectiveness in handling power emergencies. We show that our battery-based solution can (i) handle emergencies of short duration on its own, (ii) supplement existing reaction mechanisms to enhance their efficacy for longer emergencies, and (iii) even provide feasible options when other knobs do not suffice.
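To make the battery-based capping idea concrete, the following is a minimal sketch of one possible online policy: draw from the battery first whenever demand exceeds the provisioned cap, and fall back to throttling only for the part the battery cannot cover. It is an illustration only and not one of the heuristics evaluated in the paper. The function name `cap_with_battery`, the `Step` record, the discharge-rate limit, and the ideal (lossless) battery model are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    utility_w: float    # power drawn from the utility (never above the cap)
    battery_w: float    # power supplied by the battery
    throttled_w: float  # excess demand that must be shed by other knobs
    battery_wh: float   # energy left in the battery after this step


def cap_with_battery(demand_w, cap_w, battery_wh, max_discharge_w, step_s=1.0):
    """Greedy battery-first capping (hypothetical policy, ideal battery).

    demand_w        -- iterable of power demands in watts, one per time step
    cap_w           -- provisioned utility capacity in watts
    battery_wh      -- initial stored energy in watt-hours
    max_discharge_w -- assumed limit on battery discharge rate in watts
    step_s          -- length of one time step in seconds
    """
    trace = []
    for demand in demand_w:
        # Demand above the provisioned cap constitutes the emergency.
        excess = max(0.0, demand - cap_w)
        # Highest power the remaining energy can sustain for a full step.
        sustainable_w = battery_wh * 3600.0 / step_s
        from_battery = min(excess, max_discharge_w, sustainable_w)
        # Deplete the battery; clamp to guard against rounding below zero.
        battery_wh = max(0.0, battery_wh - from_battery * step_s / 3600.0)
        # Whatever the battery cannot cover is left to throttling/migration.
        throttled = excess - from_battery
        trace.append(Step(demand - excess, from_battery, throttled, battery_wh))
    return trace


if __name__ == "__main__":
    # One-hour steps keep the arithmetic easy to follow.
    for step in cap_with_battery([200, 300, 300, 300], cap_w=250,
                                 battery_wh=100, max_discharge_w=80,
                                 step_s=3600):
        print(step)
```

In this toy run the battery absorbs the first two hours of a 50 W overload and is then empty, so the third overloaded hour has to be handled by throttling. That mirrors the distinction the abstract draws between short emergencies the battery handles alone and longer ones where it supplements other knobs.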
