Reinforcement Learning-Based Design of Linear Fixed-Structure Controllers

Reinforcement learning has been successfully applied to PID controller tuning in several applications. Existing methods typically rely on function approximators, such as neural networks, to update the controller parameters at each time step of the underlying process. In this work, we present a simple finite-difference approach, based on random search, for tuning linear fixed-structure controllers. For clarity and simplicity, we focus on PID controllers. Our algorithm operates on the entire closed-loop step response of the system and iteratively improves the PID gains toward a desired closed-loop response. This allows stability requirements to be embedded directly in the reward function, without any explicit modeling procedure.
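The idea above can be sketched in a few lines of Python. The sketch below is illustrative, not the paper's implementation: it assumes a hypothetical first-order plant dy/dt = -y + u simulated by Euler discretization, uses integrated squared tracking error over the whole step response as the (negated) reward, and applies a basic two-sided finite-difference random search over the PID gains. Plant, horizon, and hyperparameters are all placeholder choices.

```python
import numpy as np

def step_response_cost(gains, dt=0.01, T=5.0):
    """Simulate the closed-loop unit-step response of a hypothetical
    first-order plant dy/dt = -y + u under PID control (Euler
    discretization) and return the integrated squared tracking error."""
    kp, ki, kd = gains
    y, integ, prev_err = 0.0, 0.0, 1.0
    cost = 0.0
    for _ in range(int(T / dt)):
        err = 1.0 - y                        # unit-step setpoint
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv
        prev_err = err
        y += dt * (-y + u)                   # Euler step of the plant
        cost += err ** 2 * dt
        if not np.isfinite(y) or abs(y) > 1e6:
            return 1e6                       # penalize unstable rollouts
    return cost

def random_search_pid(n_iters=200, n_dirs=8, step=0.05, noise=0.1, seed=0):
    """Basic random search: probe +/- random perturbations of the PID
    gains, form a finite-difference estimate of the cost gradient from
    whole closed-loop rollouts, and take a small step downhill."""
    rng = np.random.default_rng(seed)
    gains = np.array([1.0, 0.1, 0.01])       # initial (kp, ki, kd) guess
    for _ in range(n_iters):
        deltas = rng.standard_normal((n_dirs, 3))
        grad = np.zeros(3)
        for d in deltas:
            c_plus = step_response_cost(gains + noise * d)
            c_minus = step_response_cost(gains - noise * d)
            grad += (c_plus - c_minus) * d
        gains -= step * grad / (n_dirs * noise)
        gains = np.clip(gains, 0.0, 50.0)    # keep gains in a sane range
    return gains
```

Because each candidate gain vector is scored on a full rollout, stability penalties (here, the crude `1e6` cap on diverging responses) enter the objective directly, without any model of the plant.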
