Short term management of hydro-power system using reinforcement learning

The fundamental objective in operating a reservoir complex is to specify an optimal decision policy that maximizes the expected value of the reward function over the planning horizon. This control problem is made more challenging by the different sources of uncertainty that the reservoir planner must deal with. Usually, a trade-off exists between the value of water in storage and electricity production. In the reservoir management problem, the value-of-water function is uncertain and nonlinear, and it depends heavily on the storage of the reservoir itself as well as on the storage of the other reservoirs. The challenge is therefore how to solve this large-scale multireservoir problem in the presence of several uncertainties. In this thesis, a novel approach known as Reinforcement Learning (RL) is integrated to provide an efficient optimization of a large-scale hydroelectric power system. RL is a branch of artificial intelligence that offers several key benefits in treating problems too large to be handled by traditional dynamic programming techniques. In this approach, an agent interacts with the environment and continuously learns the decisions that maximize the reward function. This study presents the major concepts and computational aspects of using RL for the short-term planning problem of a multireservoir system. The developed RL-based optimization model was successfully implemented on the Hydro-Quebec multireservoir complex located on the Riviere Romaine, north of the municipality of Havre-Saint-Pierre on the north shore of the St. Lawrence. The model was subsequently used to obtain optimal water release policies for this reservoir complex. The output of the model was compared with that of a conventional optimization method, deterministic dynamic programming. The results show that the RL model is much more efficient and reliable in solving large-scale reservoir operation problems and can give a very good approximate solution to this complex problem.
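
The agent-environment loop described above can be illustrated with a minimal sketch. It assumes tabular Q-learning on a toy single reservoir; the abstract does not say which RL algorithm the thesis uses, and the storage levels, inflow values, reward formula, and every name below (`step`, `q_learning`, `greedy_policy`) are hypothetical choices made for illustration only, not the model applied to the Romaine complex.

```python
import random

# All constants are illustrative assumptions, not values from the thesis.
N_LEVELS = 6          # discrete storage levels 0..5
MAX_RELEASE = 2       # largest release per period
INFLOWS = [0, 1, 2]   # equally likely random inflows


def step(storage, release, rng):
    """Simulate one period and return (reward, next_storage)."""
    release = min(release, storage)            # cannot release more than is stored
    reward = release * (1.0 + 0.1 * storage)   # toy head effect: fuller reservoir, more energy
    inflow = rng.choice(INFLOWS)
    # Water above the top storage level is spilled and earns nothing.
    next_storage = min(storage - release + inflow, N_LEVELS - 1)
    return reward, next_storage


def q_learning(episodes=2000, horizon=24, alpha=0.1, gamma=0.95,
               epsilon=0.1, seed=0):
    """Learn a table Q[storage][release] by interacting with the simulator."""
    rng = random.Random(seed)
    q = [[0.0] * (MAX_RELEASE + 1) for _ in range(N_LEVELS)]
    for _ in range(episodes):
        s = rng.randrange(N_LEVELS)
        for _ in range(horizon):
            if rng.random() < epsilon:         # explore a random release
                a = rng.randrange(MAX_RELEASE + 1)
            else:                              # exploit the current estimate
                a = max(range(MAX_RELEASE + 1), key=lambda x: q[s][x])
            r, s2 = step(s, a, rng)
            # Standard Q-learning update toward the one-step bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q


def greedy_policy(q):
    """Return the release with the highest learned value for each storage level."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning())` returns one release decision per storage level. A table like this grows exponentially with the number of reservoirs, which is the scaling difficulty the abstract attributes to traditional dynamic programming; a multireservoir application would need a more compact representation than this sketch uses.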
