Leader–Follower Output Synchronization of Linear Heterogeneous Systems With Active Leader Using Reinforcement Learning

This paper develops optimal control protocols for the distributed output synchronization problem of leader–follower multiagent systems with an active leader. The agents are heterogeneous: their dynamics and dimensions may differ. The desired trajectory is preplanned and generated by the leader, and the followers synchronize to the leader autonomously by interacting with each other over a communication network. The leader is active in the sense that it has a nonzero control input, so it can act independently and update its control to keep the followers away from possible danger. A distributed observer is first designed to estimate the leader’s state and generate the reference signal for each follower. The output synchronization problem is then formulated as a distributed optimal tracking problem, and inhomogeneous algebraic Riccati equations (AREs) are derived to solve it. The resulting distributed optimal control protocols not only minimize the steady-state error but also optimize the transient response of the agents. An off-policy reinforcement learning algorithm is developed to solve the inhomogeneous AREs online without requiring any knowledge of the agents’ dynamics. Finally, two simulation examples illustrate the effectiveness of the proposed algorithm.
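
To make the distributed-observer step concrete, the following is a minimal sketch of a standard consensus-based observer of the kind commonly used in cooperative output regulation. It is not the observer designed in the paper. It assumes the leader input is zero over the simulated window, so it does not reproduce the paper's handling of an active leader. The leader matrix `S`, the chain graph, the coupling gain, and the function name `simulate_observer` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_observer(steps=10000, dt=1e-3, gain=5.0):
    """Euler simulation of a consensus-based distributed observer.

    Assumptions (not from the paper): the leader runs with zero input,
    S is a harmonic oscillator, and the graph is leader -> 0 -> 1 -> 2.
    Returns each follower's final estimation error norm.
    """
    # Hypothetical leader dynamics: d(leader)/dt = S @ leader.
    S = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])
    # A[i, j] = 1 means follower i receives follower j's estimate.
    A = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
    # Pinning gains: only follower 0 measures the leader directly.
    g = np.array([1.0, 0.0, 0.0])

    leader = np.array([1.0, 0.0])
    est = np.zeros((3, 2))  # row i holds follower i's estimate

    for _ in range(steps):
        # sum_j a_ij * (est_j - est_i) for every follower i.
        neighbor_term = A @ est - A.sum(axis=1, keepdims=True) * est
        # g_i * (leader - est_i); nonzero only for the pinned follower.
        pinning_term = g[:, None] * (leader - est)
        est_dot = est @ S.T + gain * (neighbor_term + pinning_term)
        leader_dot = S @ leader
        est = est + dt * est_dot
        leader = leader + dt * leader_dot

    return np.linalg.norm(est - leader, axis=1)


if __name__ == "__main__":
    print(simulate_observer())
```

In this sketch only follower 0 hears the leader, yet all three estimates converge, because the leader's information propagates down the chain through the neighbor term. Each follower could then use its own estimate as the reference signal for a local tracking controller.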
