Real-time energy purchase optimization for a storage-integrated photovoltaic system by deep reinforcement learning

Abstract The objective of this article is to minimize the cost of energy purchased in real time for a storage-integrated photovoltaic (PV) system installed in a microgrid. This is a complex task because storage charging and discharging characteristics are non-linear and solar energy generation, demand, and market prices are uncertain. It requires a proper tradeoff between storing too much and too little energy in the battery: in the former case future excess PV energy is lost, and in the latter case demand is exposed to future high electricity prices. We propose a reinforcement learning approach to deal with the non-stationary environment and non-linear storage characteristics. To make this approach applicable, we present a novel formulation of the decision problem that focuses on optimizing grid energy purchases rather than on direct storage control. This limits the complexity of the state and action spaces, making it possible to achieve satisfactory learning speed and avoid stability issues. The Q-learning algorithm, combined with a dense deep neural network for function representation, is then used to learn an optimal decision policy. The algorithm incorporates enhancements that prior work found to improve learning speed and stability, such as experience replay, a target network, and an increasing discount factor. Extensive simulations performed on real data confirm that our approach is effective and outperforms rule-based heuristics.
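The three enhancements named in the abstract (experience replay, a target network, and an increasing discount factor) can be illustrated with a minimal sketch. This is not the article's implementation. To stay short and self-contained, a linear Q-function in NumPy stands in for the dense deep neural network, the transitions are random synthetic data with no physical meaning, and every name and constant (`ReplayBuffer`, `next_gamma`, `td_targets`, `train_step`, the geometric schedule with `rate=0.98`, the cap `gamma_max=0.99`, the sync period of 20 steps) is an assumption chosen for illustration.

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)  # oldest transitions are dropped first

    def add(self, transition):
        self.data.append(transition)

    def sample(self, batch_size, rng):
        # Uniform sampling without replacement from the stored transitions.
        return rng.sample(list(self.data), min(batch_size, len(self.data)))


def next_gamma(gamma, gamma_max=0.99, rate=0.98):
    """Move the discount factor geometrically toward 1, capped at gamma_max.

    The schedule and its constants are assumptions, not taken from the article.
    """
    return min(gamma_max, 1.0 - rate * (1.0 - gamma))


def td_targets(batch, target_weights, gamma):
    """Q-learning targets r + gamma * max_a' Q_target(s', a')."""
    rewards = np.array([r for _, _, r, _ in batch])
    next_states = np.stack([s2 for _, _, _, s2 in batch])
    next_q = next_states @ target_weights  # shape (batch, n_actions)
    return rewards + gamma * next_q.max(axis=1)


def train_step(weights, target_weights, batch, gamma, lr=0.01):
    """One batch gradient step on the squared TD error of a linear Q-function."""
    targets = td_targets(batch, target_weights, gamma)
    new_weights = weights.copy()
    for (s, a, _, _), y in zip(batch, targets):
        td_error = y - s @ weights[:, a]  # error under the pre-update weights
        new_weights[:, a] += lr * td_error * s / len(batch)
    return new_weights


# Toy usage on synthetic transitions: 3 state features, 2 actions (hypothetical).
rng = random.Random(0)
np_rng = np.random.default_rng(0)
buffer = ReplayBuffer(capacity=1000)
weights = np.zeros((3, 2))
target_weights = weights.copy()
gamma = 0.9

for step in range(200):
    s, s2 = np_rng.random(3), np_rng.random(3)
    a = int(np_rng.integers(2))
    r = -float(s[0]) * a  # made-up cost signal, expressed as a negative reward
    buffer.add((s, a, r, s2))
    batch = buffer.sample(32, rng)
    weights = train_step(weights, target_weights, batch, gamma)
    if step % 20 == 0:
        target_weights = weights.copy()  # periodic target-network sync
        gamma = next_gamma(gamma)        # slowly lengthen the planning horizon
```

In this sketch the targets are computed from `target_weights`, which change only at each sync, so the regression target stays fixed between syncs. Starting with a smaller discount factor and raising it gradually shortens the effective horizon early in training.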