Differential-game for resource aware approximate optimal control of large-scale nonlinear systems with multiple players

In this paper, we propose a novel differential-game-based neural network (NN) control architecture to solve an optimal control problem for a class of large-scale nonlinear systems involving N players. We focus on simultaneously optimizing the use of computational resources and the system performance. In particular, the N players' control policies are designed to cooperatively optimize the performance of the large-scale system, and the sampling intervals for each player are designed to reduce the frequency of feedback execution. To develop a unified design framework that achieves both objectives, we formulate an optimal control problem that integrates both design requirements, which leads to a multi-player differential game. A solution to this problem is obtained numerically by solving the associated Hamilton-Jacobi (HJ) equation online and forward in time using event-driven approximate dynamic programming (E-ADP) and artificial NNs. We employ critic NNs to approximate the solution to the HJ equation, i.e., the optimal value function, with aperiodically available feedback information. Using the NN-approximated value function, we design the control policies and the sampling schemes. Finally, the event-driven N-player system is remodeled as a hybrid dynamical system with impulsive weight update rules to analyze its stability and convergence properties. The closed-loop practical stability of the system and the Zeno-free behavior of the sampling scheme are demonstrated using the Lyapunov method. Simulation results from a numerical example are included to substantiate the analytical results.
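The idea of reducing the frequency of feedback execution can be illustrated with a minimal sketch. This is not the paper's method: it assumes a single scalar linear system with a fixed feedback gain in place of the N-player nonlinear system and the critic-NN policies, and it uses a simple relative-threshold trigger. The function name and all parameters (`a`, `b`, `k`, `sigma`, `dt`) are hypothetical and chosen only for illustration.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=3.0, sigma=0.2,
                             x0=1.0, dt=1e-3, t_final=5.0):
    """Forward-Euler simulation of x' = a*x + b*u with u = -k * x_held.

    x_held is the most recently transmitted state. It is refreshed only
    when the gap |x - x_held| exceeds sigma * |x| (an assumed trigger rule).
    Returns the final state, the number of transmissions, and the step count.
    """
    steps = int(round(t_final / dt))
    x = x0
    x_held = x0      # state last sent to the controller
    events = 1       # count the initial transmission
    for _ in range(steps):
        # Event condition: transmit only when the held state is stale.
        if abs(x - x_held) > sigma * abs(x):
            x_held = x
            events += 1
        u = -k * x_held              # control uses the held state only
        x += dt * (a * x + b * u)    # forward-Euler integration step
    return x, events, steps


if __name__ == "__main__":
    x_final, events, steps = simulate_event_triggered()
    print(f"final state: {x_final:.2e}")
    print(f"feedback updates: {events} of {steps} integration steps")
```

With these assumed values the state is still regulated toward zero while the feedback is refreshed at only a small fraction of the integration steps, which is the trade-off between performance and resource use that the abstract describes.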
