Statistical profiling-based techniques for effective power provisioning in data centers

Current capacity planning practices based on heavy over-provisioning of power infrastructure hurt (i) the operational costs of data centers as well as (ii) the computational work they can support. We explore a combination of statistical multiplexing techniques to improve the utilization of the power hierarchy within a data center. At the highest level of the power hierarchy, we employ controlled underprovisioning and over-booking of power needs of hosted workloads. At the lower levels, we introduce the novel notion of soft fuses to flexibly distribute provisioned power among hosted workloads based on their needs. Our techniques are built upon a measurement-driven profiling and prediction framework to characterize key statistical properties of the power needs of hosted workloads and their aggregates. We characterize the gains in terms of the amount of computational work (CPU cycles) per provisioned unit of power Computation per Provisioned Watt (CPW). Our technique is able to double the CPWoffered by a Power Distribution Unit (PDU) running the e-commerce benchmark TPC-W compared to conventional provisioning practices. Over-booking the PDU by 10% based on tails of power profiles yields a further improvement of 20%. Reactive techniques implemented on our Xen VMM-based servers dynamically modulate CPU DVFS states to ensure power draw below the limits imposed by soft fuses. Finally, information captured in our profiles also provide ways of controlling application performance degradation despite overbooking. The 95th percentile of TPC-W session response time only grew from 1.59 sec to 1.78 sec--a degradation of 12%.

[1]  Fan Zhang,et al.  A statistical approach to predictive detection , 2001, Comput. Networks.

[2]  Krishna Kant,et al.  Overload Control Mechanisms for Web Servers , 2001 .

[3]  Karsten Schwan,et al.  VirtualPower: coordinated power management in virtualized enterprise systems , 2007, SOSP.

[4]  Anand Sivasubramaniam,et al.  QDSL: a queuing model for systems with differential service levels , 2008, SIGMETRICS '08.

[5]  Xiaorui Wang,et al.  Server-Level Power Control , 2007, Fourth International Conference on Autonomic Computing (ICAC'07).

[6]  Wolf-Dietrich Weber,et al.  Power provisioning for a warehouse-sized computer , 2007, ISCA '07.

[7]  John Paul Shen,et al.  Mitigating Amdahl's law through EPI throttling , 2005, 32nd International Symposium on Computer Architecture (ISCA'05).

[8]  Xiaorui Wang,et al.  Cluster-level feedback power control for performance optimization , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[9]  Vanish Talwar,et al.  No "power" struggles: coordinated multi-level power management for the data center , 2008, ASPLOS.

[10]  Prashant J. Shenoy,et al.  Resource overbooking and application profiling in shared hosting platforms , 2002, OSDI '02.

[11]  Karthick Rajamani,et al.  A performance-conserving approach for reducing peak power consumption in server systems , 2005, ICS '05.

[12]  Amin Vahdat,et al.  ECOSystem: managing energy as a first class operating system resource , 2002, ASPLOS X.

[13]  Andrew Warfield,et al.  Live migration of virtual machines , 2005, NSDI.

[14]  Walter Willinger,et al.  Self-Similar Network Traffic and Performance Evaluation , 2000 .

[15]  Wayne D. Smith,et al.  TPC-W: Benchmarking An Ecommerce Solution , 2001 .

[16]  Ricardo Bianchini,et al.  C-Oracle: Predictive thermal management for data centers , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[17]  Chaiwat Oottamakorn,et al.  Statistical service assurances for traffic scheduling algorithms , 2000, IEEE Journal on Selected Areas in Communications.

[18]  Andrew Warfield,et al.  Xen and the art of virtualization , 2003, SOSP '03.

[19]  UrgaonkarBhuvan,et al.  Resource overbooking and application profiling in shared hosting platforms , 2002 .

[20]  Murali Annavaram,et al.  Mitigating Amdahl's Law through EPI Throttling , 2005, ISCA 2005.

[21]  Frank Bellosa,et al.  Process cruise control: event-driven clock scaling for dynamic power management , 2002, CASES '02.

[22]  Carl A. Waldspurger,et al.  Memory resource management in VMware ESX server , 2002, OSDI '02.

[23]  Anand Sivasubramaniam,et al.  Profiling, Prediction, and Capping of Power Consumption in Consolidated Environments , 2008, 2008 IEEE International Symposium on Modeling, Analysis and Simulation of Computers and Telecommunication Systems.

[24]  Barry C. Smith,et al.  Yield Management at American Airlines , 1992 .

[25]  David E. Irwin,et al.  Ensemble-level Power Management for Dense Blade Servers , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[26]  Frank Bellosa,et al.  Energy Management for Hypervisor-Based Virtual Machines , 2007, USENIX Annual Technical Conference.