GrHDP Solution for Optimal Consensus Control of Multiagent Discrete-Time Systems

This paper develops a new online learning consensus control scheme for multiagent discrete-time systems using goal representation heuristic dynamic programming (GrHDP) techniques. The agents interact with each other through a communication graph, so each agent can receive information only from itself and its neighbors. Our goal is to design a GrHDP method that achieves consensus control, making all agents track the desired dynamics while the performance indices reach a Nash equilibrium. New local internal reinforcement signals and local performance indices are defined for each agent, and the corresponding distributed control laws are designed. A GrHDP algorithm is then developed to solve the multiagent consensus control problem, together with a proof of convergence. It is shown that the designed local internal reinforcement signals are bounded and that the local performance indices converge monotonically to their optimal values. Moreover, the distributed control laws also attain optimality. Two simulation studies, one with four agents and another with ten agents, validate the theoretical analysis and demonstrate the effectiveness of the proposed method.
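To make the graph-restricted information flow concrete, the following minimal sketch computes a local neighborhood tracking error for each agent from its own state and the states of its neighbors only. The error form used here (a weighted sum of differences to neighbors plus a pinning term to the desired trajectory) is an assumption borrowed from the common formulation in the consensus literature; the abstract does not state the exact definition. The function name, the four-agent topology, the pinning gains, and the scalar states are all hypothetical.

```python
def local_tracking_errors(states, adjacency, pinning, leader_state):
    """Return one local error per agent (assumed form, not from the paper).

    adjacency[i][j] > 0 means agent i receives information from agent j.
    pinning[i] > 0 means agent i can observe the desired (leader) state.
    """
    n = len(states)
    errors = []
    for i in range(n):
        # Disagreement with neighbors: only entries with a nonzero edge weight contribute.
        e = sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        # Disagreement with the desired trajectory, if agent i is pinned to it.
        e += pinning[i] * (states[i] - leader_state)
        errors.append(e)
    return errors


# Hypothetical four-agent chain: agent 0 sees the leader, agent k sees agent k-1.
adjacency = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
pinning = [1, 0, 0, 0]
states = [1.0, 2.0, 4.0, 7.0]
leader_state = 0.0

errors = local_tracking_errors(states, adjacency, pinning, leader_state)
print(errors)  # [1.0, 1.0, 2.0, 3.0]
```

In this sketch, every local error is zero when all agents sit on the desired state, which is the condition a consensus controller would aim to reach. It illustrates only the information structure, not the GrHDP learning algorithm itself.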
