Reinforcement learning in sustainable energy and electric systems: a survey

Abstract: The dynamics of sustainable energy and electric systems vary significantly with changes in the environment and the load, and these systems are multivariate, highly complex, nonlinear and uncertain. The integration of intermittent renewable energy sources and the energy consumption behaviour of households introduce further uncertainty. Operation, control and decision-making in such an environment require greater intelligence and flexibility in control and optimization to ensure the quality of service of these systems. Reinforcement learning is a broad class of optimal control strategies that estimate value functions from experience, simulation, or search in order to learn in highly dynamic, stochastic environments. Learning through interaction with the environment gives reinforcement learning strong learning ability and high adaptability. Reinforcement learning can operate without a model of the system dynamics, which makes it suitable for sustainable energy and electric systems with complex nonlinearity and uncertainty. Its use in these systems is expected to change the traditional energy utilization mode and bring more intelligence into the system. This survey gives an overview of reinforcement learning, explains the demand for it in sustainable energy and electric systems, reviews its applications in these systems, and discusses future challenges and opportunities.
