Off-policy reinforcement learning for distributed output synchronization of linear multi-agent systems

In this paper, off-policy reinforcement learning (RL) is used to find a model-free optimal solution to the H∞ output synchronization problem for heterogeneous discrete-time multi-agent systems. First, the output synchronization problem is formulated as a set of local optimal tracking problems. It is shown that the optimal local synchronization control protocols can be found by solving augmented game algebraic Riccati equations (GAREs). Solving these GAREs, however, requires every agent to access the leader's state and to know the system dynamics. To remove the first requirement, a distributed adaptive observer is designed to estimate the leader state for each agent without requiring complete knowledge of the leader dynamics. To remove the second, an off-policy RL algorithm is used to learn the solutions of the GAREs from measured data only, without knowledge of the agent or leader dynamics. In contrast to existing model-free approaches, the proposed method does not require the disturbance input to be adjusted in a prescribed manner. A simulation example demonstrates the effectiveness of the proposed method.
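For context, with local dynamics of the generic form x_{k+1} = A x_k + B u_k + D w_k, performance weights Q and R, and attenuation level γ (these symbols are illustrative, not the paper's exact augmented agent–leader quantities), the discrete-time game algebraic Riccati equation characterizing the saddle-point solution takes the standard form

```latex
P \;=\; Q + A^{\top} P A
  \;-\; \begin{bmatrix} A^{\top} P B & A^{\top} P D \end{bmatrix}
        \begin{bmatrix} R + B^{\top} P B & B^{\top} P D \\[2pt]
                        D^{\top} P B & D^{\top} P D - \gamma^{2} I \end{bmatrix}^{-1}
        \begin{bmatrix} B^{\top} P A \\[2pt] D^{\top} P A \end{bmatrix},
```

with the optimal control gain and the worst-case disturbance gain obtained from the same block matrices.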
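To illustrate the model-free flavor of the approach, the sketch below shows a generic Q-learning-style policy iteration for a single zero-sum linear-quadratic game; it is not the paper's exact off-policy algorithm over the augmented agent–leader dynamics, and all function names, data formats, and dimensions (learn_game_riccati, quad_features, the tuple layout of data) are assumptions made for illustration. The game Riccati solution is learned from measured tuples (x_k, u_k, w_k, x_{k+1}) collected under arbitrary, sufficiently exciting behavior inputs.

```python
import numpy as np

def quad_features(z):
    """Quadratic basis phi(z) such that z^T H z = phi(z) @ vecs(H),
    where vecs(H) stacks the upper-triangular entries of the symmetric
    kernel H (off-diagonal terms carry a factor of 2)."""
    n = len(z)
    return np.array([z[i] * z[j] if i == j else 2.0 * z[i] * z[j]
                     for i in range(n) for j in range(i, n)])

def unvecs(theta, n):
    """Rebuild the symmetric n x n kernel from its stacked entries."""
    H, idx = np.zeros((n, n)), 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def learn_game_riccati(data, n, m, q, Q, R, gamma, iters=30):
    """Model-free policy iteration on the zero-sum game Q-function.

    data : list of measured tuples (x, u, w, x_next); the applied
           inputs u, w can be arbitrary as long as the data is rich.
    Returns the control gain K and worst-case disturbance gain L
    (target policies u = -K x, w = -L x) plus the learned kernel H.
    """
    K, L = np.zeros((m, n)), np.zeros((q, n))
    for _ in range(iters):
        Phi, rew = [], []
        for x, u, w, xn in data:
            z  = np.concatenate([x, u, w])                # applied inputs
            zn = np.concatenate([xn, -K @ xn, -L @ xn])   # target policies
            # Bellman equation: z^T H z - zn^T H zn = r(x, u, w)
            Phi.append(quad_features(z) - quad_features(zn))
            rew.append(x @ Q @ x + u @ R @ u - gamma**2 * (w @ w))
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(rew), rcond=None)
        H = unvecs(theta, n + m + q)
        # Policy improvement from the learned kernel blocks
        Hux, Hwx = H[n:n + m, :n], H[n + m:, :n]
        Hblk = H[n:, n:]                                  # [[Huu Huw],[Hwu Hww]]
        KL = np.linalg.solve(Hblk, np.vstack([Hux, Hwx]))
        K, L = KL[:m, :], KL[m:, :]
    return K, L, H
```

Because the Q-function Bellman equation holds for whatever (u_k, w_k) was actually applied, the same batch of trajectory data can be reused at every iteration, which is what makes the scheme off-policy; in particular, the applied disturbance need not follow its worst-case policy.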
