Optimal Tracking Control of Heterogeneous Multi-agent Systems with Switching Topology Via Actor-Critic Neural Networks

In this paper, an optimal tracking control problem is solved for high-order heterogeneous multi-agent systems with time-varying interaction networks within the framework of reinforcement learning. First, the optimal tracking control problem is formulated for a leader-follower multi-agent system. Second, a policy-iteration-based adaptive dynamic programming (ADP) algorithm is proposed to compute the performance index and the control law, and the convergence of the proposed algorithm to the optimal solution is analyzed. Third, an actor-critic neural network is applied to approximate the iterative performance index function and the control law, which implements the policy iteration algorithm online without requiring knowledge of the system dynamics. Finally, simulation results are presented to demonstrate the proposed optimal tracking control strategy.
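To make the policy iteration step concrete, the following is a minimal sketch under strong simplifying assumptions; it is not the algorithm of the paper. It assumes a single agent whose tracking error obeys known discrete-time linear dynamics e_{k+1} = A e_k + B u_k with quadratic cost, so policy evaluation reduces to a Lyapunov equation and policy improvement has a closed form. The matrices A, B, Q, R, the initial gain K0, and all function names are illustrative choices. The paper's setting (heterogeneous agents, switching topology, model-free actor-critic approximation) replaces both steps with neural-network approximations learned from data.

```python
import numpy as np


def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation: solve P = Ac' P Ac + Q + K' R K, with Ac = A - B K."""
    Ac = A - B @ K
    M = Q + K.T @ R @ K
    n = A.shape[0]
    # Row-major vectorization: vec(Ac' P Ac) = kron(Ac', Ac') vec(P).
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    P = np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def improve_policy(A, B, R, P):
    """Policy improvement: K = (R + B' P B)^{-1} B' P A, for u = -K e."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def policy_iteration(A, B, Q, R, K0, tol=1e-10, max_iter=100):
    """Alternate evaluation and improvement until the gain stops changing.

    K0 must be stabilizing (an admissible initial policy).
    """
    K = K0
    for _ in range(max_iter):
        P = evaluate_policy(A, B, Q, R, K)
        K_new = improve_policy(A, B, R, P)
        if np.linalg.norm(K_new - K) < tol:
            return P, K_new
        K = K_new
    return P, K


# Illustrative (assumed) error dynamics; A is already stable, so K0 = 0 is admissible.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration(A, B, Q, R, K0)
```

In this sketch, P plays the role of the performance index (critic) and K the control law (actor). At convergence P satisfies the discrete-time algebraic Riccati equation, which is the linear-quadratic analogue of the optimality condition that the convergence analysis targets.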
