Paper Information - Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning

This paper addresses the average cost minimization problem for discrete-time systems with multiplicative and additive noise via reinforcement learning. Using the Q-function, we propose an online learning scheme that estimates the kernel matrix of the Q-function and updates the control gain from data collected along the system trajectories. The resulting control gain and kernel matrix are proved to converge to the optimal ones. To implement the proposed scheme, an online model-free reinforcement learning algorithm is given, in which the recursive least squares method is used to estimate the kernel matrix of the Q-function. A numerical example illustrates the proposed approach.
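The two ingredients named in the abstract, recursive least squares (RLS) estimation of a quadratic Q-function kernel and a control gain computed from that kernel, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it fits a known symmetric kernel `H_true` from synthetic noise-free samples rather than from a Bellman equation along system trajectories, and it assumes the common convention `Q(x, u) = z' H z` with `z = [x; u]` and `u = K x`. All function names, dimensions, and the initial covariance are hypothetical.

```python
import numpy as np

def quad_basis(z):
    """Regressor phi(z) such that z @ H @ z == phi(z) @ theta for symmetric H,
    where theta stacks the upper-triangular entries of H."""
    i, j = np.triu_indices(len(z))
    scale = np.where(i == j, 1.0, 2.0)  # off-diagonal terms appear twice in z'Hz
    return scale * z[i] * z[j]

def kernel_from_theta(theta, m):
    """Rebuild the symmetric m-by-m kernel from its upper-triangular entries."""
    H = np.zeros((m, m))
    i, j = np.triu_indices(m)
    H[i, j] = theta
    H[j, i] = theta
    return H

def rls_update(theta, P, phi, y):
    """One standard RLS step (no forgetting factor) for the model y = phi @ theta."""
    P_phi = P @ phi
    gain = P_phi / (1.0 + phi @ P_phi)
    theta = theta + gain * (y - phi @ theta)
    P = P - np.outer(gain, P_phi)  # valid because P is symmetric
    return theta, P

def gain_from_kernel(H, n):
    """Gain K minimizing z'Hz over u for z = [x; u], with n the state dimension
    (assumed convention: u = K x, K = -H_uu^{-1} H_ux)."""
    return -np.linalg.solve(H[n:, n:], H[n:, :n])

# Synthetic demo: 2 states, 1 input, made-up positive definite kernel.
rng = np.random.default_rng(0)
n, m = 2, 3
H_true = np.array([[2.0, 0.5, 1.0],
                   [0.5, 3.0, 0.2],
                   [1.0, 0.2, 4.0]])

theta = np.zeros(m * (m + 1) // 2)
P = 1e6 * np.eye(theta.size)  # large initial covariance: weak prior on theta
for _ in range(200):
    z = rng.standard_normal(m)
    theta, P = rls_update(theta, P, quad_basis(z), z @ H_true @ z)

H_est = kernel_from_theta(theta, m)
K = gain_from_kernel(H_est, n)
```

In an online scheme of the kind the abstract describes, the target `y` would come from measured stage costs and successive state-input pairs rather than from a known kernel, and the gain would be recomputed as the kernel estimate improves.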

Junlin Xiong | Jing Lai

[1] Aranya Chakrabortty, et al. Model-Free Reinforcement Learning of Minimal-Cost Variance Control, 2020, IEEE Control Systems Letters.

[2] Meng Zhang, et al. Data-driven adaptive optimal control for stochastic systems with unmeasurable state, 2020, Neurocomputing.

[3] Huaguang Zhang, et al. Stochastic linear quadratic optimal control for model-free discrete-time systems based on Q-learning algorithm, 2018, Neurocomputing.

[4] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.

[5] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.

[6] Graham C. Goodwin, et al. Adaptive filtering prediction and control, 1984.

[7] Fredrik Gustafsson, et al. Using Reinforcement Learning for Model-free Linear Quadratic Control with Process and Measurement Noises, 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).

[8] Frank L. Lewis, et al. Game Theory-Based Control System Algorithms with Real-Time Reinforcement Learning: How to Solve Multiplayer Games Online, 2017, IEEE Control Systems.

[9] Frank L. Lewis, et al. Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems Using Off-policy Reinforcement Learning, 2016, IEEE Transactions on Cybernetics.

[10] Tyler Summers, et al. Policy Iteration for Linear Quadratic Games With Stochastic Parameters, 2021, IEEE Control Systems Letters.

[11] Junlin Xiong, et al. Model-free optimal control of discrete-time systems with additive and multiplicative noises, 2020, Autom.

[12] Lucian Busoniu, et al. Reinforcement learning for control: Performance, stability, and deep approximators, 2018, Annu. Rev. Control.

[13] D. Kleinman, et al. Optimal stationary control of linear systems with control-dependent noise, 1969.

[14] Nevena Lazic, et al. Model-Free Linear Quadratic Control via Reduction to Expert Prediction, 2018, AISTATS.

[15] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.

[16] Zhong-Ping Jiang, et al. Adaptive Dynamic Programming for Stochastic Systems With State and Control Dependent Noise, 2016, IEEE Transactions on Automatic Control.

[17] Bian Tao, et al. Adaptive optimal control for linear stochastic systems with additive noise, 2015, 2015 34th Chinese Control Conference (CCC).

[18] Benjamin Recht, et al. A Tour of Reinforcement Learning: The View from Continuous Control, 2018, Annu. Rev. Control. Robotics Auton. Syst.

[19] Derong Liu, et al. Output Tracking Control Based on Adaptive Dynamic Programming With Multistep Policy Evaluation, 2019, IEEE Transactions on Systems, Man, and Cybernetics: Systems.

[20] Dimitri P. Bertsekas, et al. Convergence Results for Some Temporal Difference Methods Based on Least Squares, 2009, IEEE Transactions on Automatic Control.

[21] Zhong-Ping Jiang, et al. Continuous-Time Robust Dynamic Programming, 2018, SIAM J. Control. Optim.

[22] Michael I. Jordan, et al. Optimal feedback control as a theory of motor coordination, 2002, Nature Neuroscience.

[23] Andrew G. Barto, et al. Adaptive linear quadratic control using policy iteration, 1994, Proceedings of 1994 American Control Conference - ACC '94.

[24] Daniel E. Quevedo, et al. DeepCAS: A Deep Reinforcement Learning Algorithm for Control-Aware Scheduling, 2018, IEEE Control Systems Letters.

[25] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.