Model-Free Distributed Consensus Control Based on Actor–Critic Framework for Discrete-Time Nonlinear Multiagent Systems

Conventionally, when the system dynamics are known, the optimal consensus control problem is addressed by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations. In this paper, the system dynamics are assumed unknown, and a local $Q$-function-based adaptive dynamic programming method is proposed to solve the optimal consensus control problem for discrete-time nonlinear multiagent systems by approximating the solutions of the coupled HJB equations. First, a local $Q$-function is defined that accounts for the local consensus error and the actions of the agent and its neighbors. With this $Q$-function, the derivatives with respect to the weights of the consensus control policies can be obtained without a model of the system dynamics. Then, based on the local $Q$-function, a distributed policy iteration technique is developed and theoretically proved to converge to the solutions of the coupled HJB equations. An actor–critic neural network framework is constructed to implement the model-free optimal consensus control method, with the networks approximating the local $Q$-functions and the control policies. Finally, the feasibility and effectiveness of the developed method are verified by a series of simulations.
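The abstract does not give the definitions behind the local $Q$-function, so the following is only a minimal sketch of its two inputs under stated assumptions. It assumes the local consensus error takes the form common in the graphical-games literature, $e_i = \sum_j a_{ij}(x_i - x_j) + b_i(x_i - x_0)$, and that the one-step cost is quadratic in the error and in the actions of the agent and its neighbors. The function names, the weights `q`, `r_own`, `r_nbr`, and the pinning gains are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def local_consensus_error(
    i: int,
    states: Dict[int, List[float]],
    adjacency: Dict[int, Dict[int, float]],
    pinning: Dict[int, float],
    leader_state: List[float],
) -> List[float]:
    """Assumed form: e_i = sum_j a_ij (x_i - x_j) + b_i (x_i - x_0)."""
    x_i = states[i]
    n = len(x_i)
    err = [0.0] * n
    # Disagreement with each neighbor j, weighted by the edge weight a_ij.
    for j, a_ij in adjacency[i].items():
        for d in range(n):
            err[d] += a_ij * (x_i[d] - states[j][d])
    # Disagreement with the leader, weighted by the pinning gain b_i.
    for d in range(n):
        err[d] += pinning[i] * (x_i[d] - leader_state[d])
    return err


def local_utility(
    error: List[float],
    own_action: List[float],
    neighbor_actions: List[List[float]],
    q: float = 1.0,
    r_own: float = 1.0,
    r_nbr: float = 1.0,
) -> float:
    """Assumed quadratic one-step cost over the error and all local actions."""
    cost = q * sum(e * e for e in error)
    cost += r_own * sum(u * u for u in own_action)
    for u_j in neighbor_actions:
        cost += r_nbr * sum(u * u for u in u_j)
    return cost


# Hypothetical three-agent example; agent 0 has neighbors 1 and 2 and sees the leader.
states = {0: [1.0, 0.0], 1: [0.0, 0.0], 2: [2.0, 1.0]}
adjacency = {0: {1: 1.0, 2: 1.0}, 1: {0: 1.0}, 2: {}}
pinning = {0: 1.0, 1: 0.0, 2: 0.0}
leader = [0.0, 0.0]

e0 = local_consensus_error(0, states, adjacency, pinning, leader)
cost0 = local_utility(e0, own_action=[0.5], neighbor_actions=[[1.0], [0.0]])
```

In a scheme of this kind, a critic would be trained to approximate the accumulated sum of such one-step costs as a function of `e0` and the local actions. Because that function takes the actions as explicit arguments, it can be differentiated with respect to them without a model of the dynamics.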
