Energy Management in Data Centers with Server Setup Delay: A Semi-MDP Approximation

Energy management schemes for multi-server data centers with setup time mostly rely on thresholds on the number of idle servers or waiting jobs to decide when to switch servers on or off. An optimal energy management policy can be characterized as a Markov decision process (MDP), provided that the system parameters evolve in a Markovian manner. The resulting optimal reward can be defined as the weighted sum of the mean power usage and the mean delay of requested jobs. For large-scale data centers, however, these models become intractable because of the enormous state-action space, which makes conventional algorithms inefficient at finding the optimal policy. In this paper, we propose an approximate semi-MDP (SMDP) approach, termed the 'multi-level SMDP', based on state aggregation and Markovian analysis of the system behavior. Rather than averaging the transition probabilities of aggregated states as typical methods do, we introduce an approximate Markovian framework that calculates the transition probabilities of the proposed multi-level SMDP accurately. Moreover, near-optimal performance can be attained by tuning the number of levels in the multi-level approach, at the expense of increased state-space dimensionality. Simulation results show that the proposed approach reduces the SMDP size while yielding higher rewards than existing fixed threshold-based policies and aggregation methods.
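For context, the conventional baseline that the abstract contrasts against (averaging the transition probabilities of aggregated states) can be sketched as follows. This is a minimal illustration and not the proposed multi-level SMDP: the function name `aggregate_by_averaging`, the uniform weighting within each group, and the toy 4-state chain are all assumptions made for the example.

```python
def aggregate_by_averaging(P, groups):
    """Collapse a transition matrix onto aggregated states by uniform averaging.

    P      : list of rows, P[i][j] = probability of moving from state i to j
    groups : list of lists; groups[a] holds the original states merged into
             aggregated state a (assumed to partition the state space)
    """
    k = len(groups)
    Q = [[0.0] * k for _ in range(k)]
    for a, src in enumerate(groups):
        for b, dst in enumerate(groups):
            # Total probability of jumping from each member of `src` into
            # `dst`, averaged with equal weight over the members of `src`.
            Q[a][b] = sum(P[i][j] for i in src for j in dst) / len(src)
    return Q


# Toy 4-state chain with hypothetical numbers (e.g. 0..3 jobs in the system).
P = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.25, 0.50, 0.00],
    [0.00, 0.25, 0.25, 0.50],
    [0.00, 0.00, 0.50, 0.50],
]
groups = [[0, 1], [2, 3]]  # two aggregated states
Q = aggregate_by_averaging(P, groups)
print(Q)
```

Equal weighting treats every original state in a group as equally likely, which is generally not true of the underlying chain; this is the kind of inaccuracy that motivates computing the aggregated transition probabilities from an analysis of the system behavior instead.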
