A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in cloud computing systems. However, a complete cloud resource allocation framework exhibits high dimensionality in its state and action spaces, which limits the usefulness of traditional RL techniques. In addition, high power consumption has become one of the critical concerns in the design and control of cloud computing systems, as it degrades system reliability and increases cooling cost. An effective dynamic power management (DPM) policy should minimize power consumption while keeping performance degradation within an acceptable level. Thus, a joint virtual machine (VM) resource allocation and power management framework is critical to the overall cloud computing system, and a novel solution framework is necessary to address the even higher dimensionality of the combined state and action spaces. In this paper, we propose a novel hierarchical framework for solving the overall resource allocation and power management problem in cloud computing systems. The proposed framework comprises a global tier for VM resource allocation to the servers and a local tier for distributed power management of local servers. The emerging deep reinforcement learning (DRL) technique, which can handle complicated control problems with large state spaces, is adopted to solve the global-tier problem. Furthermore, an autoencoder and a novel weight-sharing structure are adopted to handle the high-dimensional state space and accelerate convergence. The local tier of distributed server power management comprises an LSTM-based workload predictor and a model-free RL-based power manager, operating in a distributed manner. Experimental results using actual Google cluster traces show that the proposed hierarchical framework significantly reduces power consumption and energy usage compared with the baseline while incurring no severe latency degradation, and that it achieves the best trade-off between latency and power/energy consumption in a server cluster.
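The abstract only summarizes the two tiers; the paper itself specifies the exact network architectures and training hyperparameters. As a rough illustration of how the local tier could be wired together, the sketch below pairs a small LSTM workload predictor with a tabular Q-learning power manager that chooses between keeping a server active and putting it to sleep. All class names, state discretizations, and reward weights are illustrative assumptions, not the authors' implementation; PyTorch and NumPy are assumed to be available.

```python
# Illustrative sketch only -- not the authors' implementation.
import numpy as np
import torch
import torch.nn as nn


class WorkloadPredictor(nn.Module):
    """Small LSTM that predicts the next job inter-arrival time."""

    def __init__(self, hidden_size: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, interarrival_seq: torch.Tensor) -> torch.Tensor:
        # interarrival_seq: (batch, seq_len, 1) recent inter-arrival times
        out, _ = self.lstm(interarrival_seq)
        return self.head(out[:, -1, :])  # predicted next inter-arrival time


class PowerManager:
    """Tabular Q-learning power manager for one server (local tier).

    State: discretized predicted idle length; actions: 0 = stay active, 1 = sleep.
    """

    def __init__(self, n_state_bins: int = 10, eps: float = 0.1,
                 alpha: float = 0.1, gamma: float = 0.95):
        self.q = np.zeros((n_state_bins, 2))
        self.n_state_bins = n_state_bins
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def discretize(self, predicted_idle: float, max_idle: float = 10.0) -> int:
        bin_width = max_idle / self.n_state_bins
        return min(int(predicted_idle / bin_width), self.n_state_bins - 1)

    def act(self, state: int) -> int:
        if np.random.rand() < self.eps:            # epsilon-greedy exploration
            return np.random.randint(2)
        return int(np.argmax(self.q[state]))

    def update(self, s: int, a: int, reward: float, s_next: int) -> None:
        # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        td_target = reward + self.gamma * np.max(self.q[s_next])
        self.q[s, a] += self.alpha * (td_target - self.q[s, a])


if __name__ == "__main__":
    predictor = WorkloadPredictor()
    manager = PowerManager()

    # One illustrative decision epoch with synthetic data.
    recent = torch.rand(1, 20, 1)                   # last 20 inter-arrival times
    predicted_idle = abs(float(predictor(recent).item()))
    s = manager.discretize(predicted_idle)
    a = manager.act(s)
    # Reward trades power saved against a latency penalty (weights are assumptions).
    reward = (1.0 - 0.5 * 0.2) if a == 1 else 0.0
    s_next = manager.discretize(predicted_idle)
    manager.update(s, a, reward, s_next)
    print("action:", "sleep" if a else "stay active", "| Q-row:", manager.q[s])
```

In the paper's global tier, the tabular Q-function would be replaced by a deep Q-network operating on the autoencoder-compressed VM/server state, but the temporal-difference update shown above is the same in spirit.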
