Paper information - Reservoir Operation Optimization by Reinforcement Learning

Planning reservoir management and the optimal operation of surface water resources has always been a critical and strategic concern for governments. Today, considerable equipment, facilities, and budgets are devoted to the optimal scheduling of water and energy resources over long and short horizons. Many researchers work in this area to improve the performance of such systems, usually by applying new mathematical and heuristic techniques to the wide variety of complexities found in real-world applications, especially large-scale problems. Stochasticity, nonlinearity/nonconvexity, and dimensionality are the main sources of complexity. Many techniques circumvent these complexities through some kind of approximation in uncertain environments where the relations between system parameters are complex and unknown. However, optimizing the operation of large-scale problems with methods that rely on many unrealistic estimates makes the final solution imprecise and usually far from the true optimum. Moreover, hardware and software limitations impose practical constraints that prevent some relations between variables and parameters from being considered. That is, even if all possible relations between the parameters of a problem are known and definable, considering all of them simultaneously may make the problem very difficult to solve.

Kumaraswamy Ponnambalam | Hamid R. Tizhoosh | Masoud Mahootchi
