Autonomic power management with self-healing in server clusters under QoS constraints

The growing use of server clusters has made their energy consumption an important issue, and several power management techniques are being developed to address it. To be useful, these techniques must account for the performance and availability implications of reducing energy consumption. This paper presents a power management technique that maintains the quality of service (QoS) levels specified in service level agreements, expressed as a threshold on a percentile of the response time. In addition, it provides self-healing by detecting server failures and automatically provisioning new servers. The technique balances the load so that it is concentrated on a small number of servers. To do so, it requires only two utilization thresholds and models of the performance and power consumption of the application executed on the servers. It works with heterogeneous servers and provides overload protection. Experiments carried out on a prototype show that the technique reduces energy consumption (by up to 57.59% compared to an always-on policy) while providing self-healing and maintaining the QoS.
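
The two ideas named in the abstract, an SLA stated as a threshold on a response-time percentile and provisioning driven by two utilization thresholds, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the nearest-rank percentile definition, the one-server-at-a-time step, and the example threshold values are all assumptions made for illustration, and the paper's performance and power models are not represented.

```python
def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list; p is an integer in 1..100.

    Assumption: the abstract does not say how the percentile is computed.
    """
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceiling of p*n/100
    return ordered[rank - 1]


def sla_met(response_times, p, threshold):
    """True if the p-th percentile of the response times is within the threshold."""
    return percentile(response_times, p) <= threshold


def plan_active_servers(active, avg_utilization, low, high):
    """Hypothetical two-threshold rule: return the new number of active servers.

    Above `high`, power one server on; below `low`, power one off (keeping at
    least one); otherwise leave the cluster unchanged.
    """
    if avg_utilization > high:
        return active + 1
    if avg_utilization < low and active > 1:
        return active - 1
    return active


if __name__ == "__main__":
    times = [0.1, 0.2, 0.3, 0.4, 2.0]  # made-up response times in seconds
    print(sla_met(times, 95, 1.0))                # is the 95th percentile within 1 s?
    print(plan_active_servers(3, 0.9, 0.3, 0.8))  # utilization above the upper threshold
```

In this sketch the gap between `low` and `high` acts as a hysteresis band, so small fluctuations in utilization do not cause servers to be switched on and off repeatedly.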
