Algorithm for Autonomous Power-Increase Operation Using Deep Reinforcement Learning and a Rule-Based System

The power start-up operation of a nuclear power plant (NPP) raises the reactor power to the full-power condition for electricity generation. Compared with full-power operation, the power-increase operation requires significantly more decision-making and therefore carries a greater potential for human error. Although previous studies have investigated artificial intelligence (AI) techniques for NPP control, none has addressed making the relatively complicated power-increase operation fully autonomous. This study focused on developing an algorithm that converts all currently manual activities in the NPP power-increase process into autonomous operations. An asynchronous advantage actor-critic (A3C) method, a type of deep reinforcement learning, and a long short-term memory (LSTM) network were applied to the operator tasks for which clear rules or logic were difficult to establish, while a rule-based system was developed for the actions that can be described by simple logic (such as if-then rules). The proposed autonomous power-increase control algorithm was trained and validated using a compact nuclear simulator (CNS). The simulation results were used to evaluate the algorithm’s ability to keep the parameters within allowable limits, and the algorithm was shown to be capable of identifying an acceptable operation path for increasing the reactor power from 2% to 100% at a specified rate of power increase. In addition, the operation pattern produced by the autonomous control simulation was found to be identical to that of the established operation strategy. These results demonstrate the potential feasibility of fully autonomous control of the NPP power-increase operation.
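The division of labor described above, with if-then rules for tasks that have clear logic and a learned policy for the rest, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Every name in it (`Rule`, `select_action`, `target_power`, `within_band`, the state keys, the action labels) is hypothetical, and the 50% threshold and the tolerance are placeholder values, not plant limits from the study. The learned policy stands in for the A3C/LSTM agent and is reduced to a stub.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, float]  # hypothetical plant state: parameter name -> value


@dataclass
class Rule:
    """One if-then rule: when `condition` holds, take `action`."""
    name: str
    condition: Callable[[State], bool]
    action: str


def rule_based_action(state: State, rules: List[Rule]) -> Optional[str]:
    """Return the action of the first rule whose condition holds, else None."""
    for rule in rules:
        if rule.condition(state):
            return rule.action
    return None


def select_action(state: State, rules: List[Rule],
                  learned_policy: Callable[[State], str]) -> str:
    """Prefer a matching rule; otherwise defer to the learned policy."""
    action = rule_based_action(state, rules)
    return action if action is not None else learned_policy(state)


def target_power(elapsed_h: float, rate_pct_per_h: float,
                 start_pct: float = 2.0, end_pct: float = 100.0) -> float:
    """Reference power (%) for a linear ramp from start_pct, capped at end_pct."""
    return min(end_pct, start_pct + rate_pct_per_h * elapsed_h)


def within_band(power_pct: float, elapsed_h: float,
                rate_pct_per_h: float, tolerance_pct: float) -> bool:
    """Check that measured power stays within +/- tolerance of the reference."""
    return abs(power_pct - target_power(elapsed_h, rate_pct_per_h)) <= tolerance_pct


# Placeholder rule: the 50% threshold is illustrative only.
EXAMPLE_RULES = [
    Rule(
        name="start second pump",
        condition=lambda s: s["power_pct"] >= 50.0 and s["second_pump_on"] == 0.0,
        action="start_second_pump",
    ),
]


def stub_policy(state: State) -> str:
    """Stand-in for a trained agent; always returns the same label."""
    return "learned_action"


if __name__ == "__main__":
    state = {"power_pct": 60.0, "second_pump_on": 0.0}
    print(select_action(state, EXAMPLE_RULES, stub_policy))
    print(within_band(31.0, elapsed_h=10.0, rate_pct_per_h=3.0, tolerance_pct=2.0))
```

In this sketch the rules take precedence over the learned policy. That ordering is a design assumption made for illustration; the abstract does not state how the two components are arbitrated.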
