Model-Free Reinforcement Learning of Minimal-Cost Variance Control

This letter proposes two reinforcement learning (RL) algorithms for solving a class of coupled algebraic Riccati equations (CARE) for linear stochastic dynamic systems with unknown state and input matrices. The CARE are formulated for a minimal-cost variance (MCV) control problem, which aims to minimize the variance of a cost function while keeping its mean within an acceptable range, using a noisy infinite-horizon full-state-feedback linear quadratic regulator (LQR). In both algorithms, the input matrix can be estimated at the very first iteration. This, in turn, saves a significant amount of computation in the intermediate steps of the learning phase by avoiding repeated inversion of a high-dimensional data matrix. The overall complexity is shown to be lower than that of RL for both stochastic and deterministic LQR. Additionally, the disturbance noise entering the model is not required to satisfy any condition to ensure the efficiency of either RL algorithm. Simulation examples are presented to illustrate the effectiveness of the two designs.
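To make the MCV objective concrete, the sketch below estimates the mean and variance of a quadratic cost by Monte Carlo simulation for a fixed state-feedback gain. It is only an illustration of the two statistics that MCV control trades off, not the RL algorithms proposed in the letter. The scalar model, the parameter values (`a`, `b`, `g`, `q`, `r`), the finite horizon, and the function names are all assumptions introduced here.

```python
import random
import statistics


def simulate_cost(k, a=-1.0, b=1.0, g=0.5, q=1.0, r=1.0,
                  x0=1.0, horizon=5.0, dt=0.01, rng=None):
    """Roll out dx = (a*x + b*u) dt + g dw with u = -k*x (Euler-Maruyama).

    Returns the accumulated quadratic cost, i.e. the time integral of
    q*x^2 + r*u^2 over a finite horizon. All parameters are illustrative
    assumptions, not values taken from the letter.
    """
    rng = rng or random.Random()
    x, cost = x0, 0.0
    steps = int(round(horizon / dt))
    for _ in range(steps):
        u = -k * x                               # full-state feedback
        cost += (q * x * x + r * u * u) * dt     # running quadratic cost
        dw = rng.gauss(0.0, dt ** 0.5)           # Brownian increment
        x += (a * x + b * u) * dt + g * dw
    return cost


def cost_statistics(k, n_paths=500, seed=0):
    """Sample mean and sample variance of the cost for feedback gain k."""
    rng = random.Random(seed)
    costs = [simulate_cost(k, rng=rng) for _ in range(n_paths)]
    return statistics.mean(costs), statistics.variance(costs)


if __name__ == "__main__":
    # Compare the two cost statistics across a few hypothetical gains.
    for k in (0.0, 0.5, 2.0):
        mean, var = cost_statistics(k)
        print(f"k={k:4.1f}  mean cost={mean:.4f}  cost variance={var:.4f}")
```

Running the script prints the estimated mean and variance for each gain, so the two quantities can be compared side by side. An MCV design would choose the gain that minimizes the variance subject to a constraint on the mean.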
