Data-based optimal coordination control of continuous-time nonlinear multi-agent systems via adaptive dynamic programming method

Abstract This paper addresses the optimal coordination control problem for continuous-time nonlinear multi-agent systems with completely unknown dynamics, using a data-based distributed adaptive dynamic programming method. In most real-world applications, accurate system models are difficult to obtain, which restricts the applicability of conventional methods. Moreover, designing optimal coordination control for multi-agent systems is challenging, especially when the communication topology is time-varying. To deal with these difficulties, we investigate a distributed adaptive dynamic programming method with an identifier-critic architecture under a switching communication topology. First, using the available system data, an online adaptive identifier is developed to approximate the unknown model dynamics; simultaneously, a critic neural network is employed to approximate the optimal cost function, which yields an approximate optimal coordination control in real time. Then, we analyze the stability of the proposed scheme. Finally, a simulation illustrates the effectiveness of the developed method.
