Model-Free Optimal Control of Linear Multiagent Systems via Decomposition and Hierarchical Approximation

Designing the optimal linear quadratic regulator (LQR) for a large-scale multiagent system is time-consuming because it requires solving a large matrix Riccati equation. The situation is further exacerbated when the design must be done in a model-free way using schemes such as reinforcement learning (RL). To reduce this computational complexity, we decompose the large-scale LQR design problem into multiple small-size LQR design problems. We consider the objective function to be specified over an undirected graph, and cast the decomposition as a graph clustering problem. The graph is decomposed into two parts: one consisting of independent clusters of connected components, and the other containing the edges that connect different clusters. Accordingly, the resulting controller has a hierarchical structure with two components. The first component optimizes the performance of each independent cluster by solving a small-size LQR design problem in a model-free way using an RL algorithm. The second component accounts for the part of the objective that couples different clusters, which is handled by solving a least-squares problem in one shot. Although suboptimal, the hierarchical controller adheres to a particular structure specified by the interagent couplings in the objective function and by the decomposition strategy. Mathematical formulations are established to find a decomposition that minimizes the number of required communication links or reduces the optimality gap. Numerical simulations are provided to highlight the pros and cons of the proposed designs.
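The graph decomposition described above can be illustrated with a minimal sketch. It assumes the objective graph is given as a weighted edge list and that a cluster assignment has already been chosen; the helper names `laplacian` and `split_edges`, the six-node example graph, and its weights are hypothetical and not taken from the paper. The sketch covers only the split into intra-cluster and inter-cluster edges, not the RL-based cluster design or the least-squares step.

```python
import numpy as np

def laplacian(n, edges):
    """Weighted Laplacian of an undirected graph on n nodes (hypothetical helper)."""
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L

def split_edges(edges, cluster_of):
    """Separate edges inside a cluster from edges connecting different clusters."""
    intra = [(i, j, w) for i, j, w in edges if cluster_of[i] == cluster_of[j]]
    inter = [(i, j, w) for i, j, w in edges if cluster_of[i] != cluster_of[j]]
    return intra, inter

# Illustrative path graph on six agents, split into two clusters of three.
n = 6
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 0.5), (3, 4, 1.0), (4, 5, 1.0)]
cluster_of = [0, 0, 0, 1, 1, 1]  # assumed cluster assignment

intra, inter = split_edges(edges, cluster_of)
L_full = laplacian(n, edges)
L_intra = laplacian(n, intra)   # block diagonal: one block per cluster
L_inter = laplacian(n, inter)   # only the edges that couple the clusters
```

Under these assumptions, `L_intra` is block diagonal with one block per cluster, so each block could be handed to a separate small-size design, while `L_inter` collects the coupling that the second component of the hierarchical controller would have to account for. The two parts sum back to the full Laplacian.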
