Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning

Electrical power management in large-scale IT systems such as commercial data-centers is an application area of rapidly growing interest from both an economic and ecological perspective, with billions of dollars and millions of metric tons of CO2 emissions at stake annually. Businesses want to save power without sacrificing performance. This paper presents a reinforcement learning approach to simultaneous online management of both performance and power consumption. We apply RL in a realistic laboratory testbed using a Blade cluster and dynamically varying HTTP workload running on a commercial web applications middleware platform. We embed a CPU frequency controller in the Blade servers' firmware, and we train policies for this controller using a multi-criteria reward signal depending on both application performance and CPU power consumption. Our testbed scenario posed a number of challenges to successful use of RL, including multiple disparate reward functions, limited decision sampling rates, and pathologies arising when using multiple sensor readings as state variables. We describe innovative practical solutions to these challenges, and demonstrate clear performance improvements over both hand-designed policies as well as obvious "cookbook" RL implementations.

[1]  Vincent W. Freeh,et al.  Boosting Data Center Performance Through Non-Uniform Power Allocation , 2005, Second International Conference on Autonomic Computing (ICAC'05).

[2]  Stuart J. Russell,et al.  Q-Decomposition for Reinforcement Learning Agents , 2003, ICML.

[3]  Xiaorui Wang,et al.  Server-Level Power Control , 2007, Fourth International Conference on Autonomic Computing (ICAC'07).

[4]  Rajarshi Das,et al.  A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation , 2006, 2006 IEEE International Conference on Autonomic Computing.

[5]  Michail G. Lagoudakis,et al.  Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..

[6]  Mark S. Squillante,et al.  Internet traffic: periodicity, tail behavior, and performance implications , 2000 .

[7]  Nagarajan Kandasamy,et al.  Adaptive Performance Control of Computing Systems via Distributed Cooperative Control: Application to Power Management in Computing Clusters , 2006, 2006 IEEE International Conference on Autonomic Computing.

[8]  Anand Sivasubramaniam,et al.  Managing server energy and operational costs in hosting centers , 2005, SIGMETRICS '05.

[9]  Rajarshi Das,et al.  Coordinating Multiple Autonomic Managers to Achieve Specified Power-Performance Tradeoffs , 2007, Fourth International Conference on Autonomic Computing (ICAC'07).

[10]  Pieter Abbeel,et al.  An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.

[11]  Csaba Szepesvári,et al.  Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path , 2006, COLT.

[12]  Virgílio A. F. Almeida,et al.  Capacity Planning for Web Performance: Metrics, Models, and Methods , 1998 .

[13]  Leemon C. Baird,et al.  Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.

[14]  Csaba Szepesvári,et al.  Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path , 2006, Machine Learning.

[15]  Pieter Abbeel,et al.  Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.