Reinforcement Learning of Structured Control for Linear Systems with Unknown State Matrix

This paper addresses the design of stabilizing feedback gains for continuous-time linear systems with an unknown state matrix, where the control is subject to a general structural constraint. We combine ideas from reinforcement learning (RL) with sufficient conditions for stability and performance to design these structured gains from trajectory measurements of states and controls. We first formulate a model-based framework using dynamic programming (DP) that embeds the structural constraint into the Linear Quadratic Regulator (LQR) gain computation in the continuous-time setting. We then transform this LQR formulation into a policy iteration RL algorithm that removes the requirement of a known state matrix while maintaining the feedback gain structure. Theoretical guarantees are provided for the stability and convergence of the structured RL (SRL) algorithm. The RL framework is general and can be applied to any control structure. One special control structure enabled by this framework is distributed learning control, which is necessary for many large-scale cyber-physical systems. We validate the theoretical results with numerical simulations on a multi-agent networked linear time-invariant (LTI) dynamic system.
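
To make the policy iteration idea concrete, the following is a minimal sketch of the classical model-based policy iteration for the continuous-time LQR problem, reduced to a scalar system so that it needs no external libraries. It is not the SRL algorithm of this paper: it assumes the state coefficient `a` is known and imposes no structural constraint on the gain. The function names, the scalar setting, and the example values are assumptions made purely for illustration.

```python
import math


def kleinman_scalar(a, b, q, r, k0, iters=20):
    """Policy iteration for the scalar LQR problem x' = a*x + b*u with u = -k*x.

    Cost: integral of q*x**2 + r*u**2. Assumes a, b, q, r are known
    (model-based) and that k0 is stabilizing, i.e. a - b*k0 < 0.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must be stabilizing")
    k = k0
    p = None
    for _ in range(iters):
        # Policy evaluation: solve the scalar Lyapunov equation
        #   2*(a - b*k)*p + q + r*k**2 = 0   for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: update the gain from the evaluated cost.
        k = b * p / r
    return p, k


def riccati_scalar(a, b, q, r):
    """Stabilizing root of the scalar Riccati equation 2*a*p - (b**2/r)*p**2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    # Hypothetical unstable plant (a > 0) with a stabilizing initial gain.
    p, k = kleinman_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print(p, k, riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

In this sketch the policy evaluation step uses the model directly. A data-driven variant in the spirit of the abstract would instead estimate the evaluated cost from measured state and control trajectories, and a structured variant would additionally restrict the gain update to the admissible structure; neither step is shown here.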
