Approximately adaptive neural cooperative control for nonlinear multiagent systems with performance guarantee

ABSTRACT This paper studies the cooperative control problem for a class of multiagent dynamical systems with partially unknown nonlinear dynamics. The control objective is to achieve state consensus among the agents while minimising a cost function defined for each individual agent. Under the assumption that admissible cooperative controls exist for such a class of multiagent systems, the problem is solved by finding the optimal cooperative control using an approximate dynamic programming and reinforcement learning approach. With the aid of neural network parameterisation and online adaptive learning, the method yields a practically implementable, approximately adaptive neural cooperative control for multiagent systems. Specifically, the Hamilton–Jacobi–Bellman (HJB) equation for multiagent systems is first derived from Bellman's principle of optimality. An approximately adaptive policy iteration algorithm for multiagent cooperative control is then proposed, based on neural network approximation of the value functions. The convergence of the proposed algorithm is rigorously proved using the contraction mapping method. Simulation results are included to validate the effectiveness of the proposed algorithm.
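The policy iteration idea named in the abstract can be illustrated on a much simpler problem than the one the paper treats. The sketch below is not the paper's multiagent algorithm: it assumes a single agent with known scalar linear dynamics x_dot = a*x + b*u, a quadratic cost, and a one-parameter value function V(x) = p*x^2 standing in for the neural network approximation. All names (`policy_iteration`, `riccati_solution`, `a`, `b`, `q`, `r`, `k0`) are hypothetical and chosen only for illustration.

```python
import math


def policy_iteration(a, b, q, r, k0, iterations=20):
    """Policy iteration for x_dot = a*x + b*u with cost integral of q*x^2 + r*u^2.

    Assumption for this sketch: the value function of the policy u = -k*x
    is parameterised as V(x) = p*x^2 (a single basis function).
    """
    if b * k0 <= a:
        # An admissible initial policy must stabilise the closed loop.
        raise ValueError("initial gain k0 is not admissible (closed loop unstable)")
    k = k0
    p = None
    for _ in range(iterations):
        # Policy evaluation: solve 2*p*(a - b*k) + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimising the Hamiltonian over u gives k = b*p/r.
        k = b * p / r
    return p, k


def riccati_solution(a, b, q, r):
    """Closed-form positive root of 2*a*p - (b*p)^2/r + q = 0, for comparison."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)
    print(p, riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this toy setting the evaluated value parameter approaches the positive root of the scalar Riccati equation, which is the HJB equation for the linear-quadratic case. The paper's setting differs in that the dynamics are nonlinear and partially unknown, several agents interact, and the value functions are approximated by neural networks trained online.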
