Data Center HVAC Control Harnessing Flexibility Potential via Real-Time Pricing Cost Optimization Using Reinforcement Learning

With rising electricity prices, cost savings through load shifting are becoming increasingly important for energy end users. While dynamic pricing encourages customers to shift demand to low-price periods, the nonstationary and highly volatile nature of electricity prices poses a significant challenge for energy management systems. In this article, we investigate the flexibility potential of data centers by optimizing their heating, ventilation, and air conditioning (HVAC) systems with a general model-free reinforcement learning (RL) approach. Because the soft actor-critic algorithm with feedforward networks did not perform satisfactorily in this scenario, we instead propose a parameterization with a recurrent neural network architecture that can handle spot-market price data. The past is encoded into a hidden state, which allows the agent to learn the temporal dependencies in the observations and in the highly volatile rewards. The proposed method is evaluated in experiments on a simulated data center. Using real temperature and price signals spanning multiple years, the results show a cost reduction compared with a proportional-integral-derivative (PID) controller while keeping the data center temperature within the desired operating range. This work thus demonstrates an innovative and applicable RL approach that incorporates complex economic objectives into agent decision-making. The proposed control method can be integrated into various Internet of Things-based smart building solutions for energy management.
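The idea of encoding the past into a hidden state can be illustrated with a minimal sketch. It assumes a plain tanh recurrent cell with random, untrained weights; the abstract does not specify the recurrent architecture actually used, and the function names, the three-feature observation layout (price, outdoor temperature, zone temperature), and the normalized values are all hypothetical.

```python
import math
import random


def make_cell(input_size, hidden_size, seed=0):
    """Random weights for a plain tanh recurrent cell (illustrative, untrained)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(hidden_size)
    w_x = [[rng.uniform(-scale, scale) for _ in range(input_size)]
           for _ in range(hidden_size)]
    w_h = [[rng.uniform(-scale, scale) for _ in range(hidden_size)]
           for _ in range(hidden_size)]
    b = [0.0] * hidden_size
    return w_x, w_h, b


def step(cell, h_prev, x):
    """One recurrent update: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    w_x, w_h, b = cell
    h = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(w * v for w, v in zip(w_x[i], x))
        pre += sum(w * v for w, v in zip(w_h[i], h_prev))
        h.append(math.tanh(pre))
    return h


def encode_history(cell, observations):
    """Fold a sequence of observations into a single hidden state."""
    h = [0.0] * len(cell[2])  # start from an all-zero hidden state
    for x in observations:
        h = step(cell, h, x)
    return h


# Hypothetical normalized observations: (price, outdoor temp, zone temp).
cell = make_cell(input_size=3, hidden_size=4)
rising_prices = [(0.2, 0.5, 0.6), (0.5, 0.5, 0.6), (0.9, 0.5, 0.6)]
falling_prices = [(0.9, 0.5, 0.6), (0.5, 0.5, 0.6), (0.9, 0.5, 0.6)]

h_rising = encode_history(cell, rising_prices)
h_falling = encode_history(cell, falling_prices)
```

Both sequences end with the same observation, yet the hidden states differ because the earlier prices differ. A feedforward policy that sees only the latest observation could not distinguish the two cases, whereas a recurrent policy or value network conditioned on the hidden state can.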
