Multilayer perception based reinforcement learning supervisory control of energy systems with application to a nuclear steam supply system

Energy system optimization is important for improving stability, reliability and economy, and it is usually formulated as a static linear or nonlinear program. The current practical challenge, however, is to perform the optimization while taking the inherent dynamics of the energy system into account. To address this challenge, a multilayer perceptron (MLP) based reinforcement learning control (RLC) method is proposed for the nonlinear dissipative system formed by coupling an arbitrary energy system with its local controllers. The method can optimize a given performance index dynamically and effectively without accurate knowledge of the system dynamics. The MLP-based RLC is composed of an MLP-based state observer and an approximate optimal controller. The MLP-based state observer performs identification, and its estimate converges asymptotically to a bounded neighborhood of the system dynamics. The approximate optimal controller is determined by solving an algebraic Riccati equation whose parameters are given by the MLP-based state observer. Using the Lyapunov direct method, it is further proven that the closed-loop system is uniformly ultimately bounded. Finally, the MLP-based RLC is applied to the supervisory optimization of the thermal power response of a nuclear steam supply system; simulation results show satisfactory performance and also reveal how the controller parameters influence the closed-loop responses.
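The step of obtaining a controller from an algebraic Riccati equation with identified parameters can be illustrated with a minimal sketch. It assumes a scalar linear model dx/dt = a x + b u and a quadratic cost with weights q and r; the names `scalar_are_gain`, `a_hat` and `b_hat` and all numerical values are hypothetical, and the closed-form scalar solution stands in for the MLP-based observer and the multivariable equation of the method described above.

```python
import math


def scalar_are_gain(a, b, q, r):
    """Solve the scalar algebraic Riccati equation
        2*a*p - (b**2 / r) * p**2 + q = 0
    for its stabilizing root p > 0 and return (p, k),
    where the feedback law is u = -k * x with k = b * p / r."""
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    # Positive root of the quadratic in p (the stabilizing solution).
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


# Hypothetical parameters, standing in for values an identifier would supply.
a_hat, b_hat = 0.5, 2.0
q_weight, r_weight = 1.0, 1.0

p, k = scalar_are_gain(a_hat, b_hat, q_weight, r_weight)

# Closed-loop dynamics under u = -k * x: dx/dt = (a - b*k) * x.
closed_loop_pole = a_hat - b_hat * k
riccati_residual = 2 * a_hat * p - (b_hat ** 2 / r_weight) * p ** 2 + q_weight
```

In this scalar case the closed-loop pole equals -sqrt(a^2 + b^2 q / r), so it is negative for any identified a, which is why the open-loop unstable example (a_hat > 0) is still stabilized. Increasing q relative to r moves the pole further left at the cost of larger control effort.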
