Optimal and Stable Control for Two-Player Zero-Sum Game Using Adaptive Dynamic Programming

In this paper, an optimal and stable iterative learning scheme is developed for two-player zero-sum games (ZSGs) in discrete-time nonlinear systems. With the developed algorithm, an optimal and stable solution of the Hamilton-Jacobi-Isaacs equation can be obtained based on adaptive dynamic programming (ADP). First, to obtain the optimal control policies, a value iteration approach is employed, and a proof of its convergence is given. Second, based on Lyapunov theory, a simple condition is proposed for obtaining stable control laws. Third, neural networks are used to implement the developed algorithm. Finally, a simulation example is included to verify the presented method.
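To make the value iteration step concrete, the following is a minimal sketch for a much simpler special case than the one treated in the paper: a scalar linear system x_{k+1} = a x_k + b u_k + c w_k with quadratic utility q x^2 + r u^2 - gamma^2 w^2, where u is the minimizing control and w is the maximizing disturbance. In this case the iterative value function has the form V_i(x) = p_i x^2, so the min-max update reduces to a scalar recursion on p_i. The system parameters, function names, and the closed-loop check are illustrative assumptions, not the paper's algorithm or its neural-network implementation.

```python
def value_iteration(a, b, c, q, r, gamma, tol=1e-10, max_iter=1000):
    """Value iteration for a scalar linear-quadratic zero-sum game (illustrative).

    V_i(x) = p_i * x**2, started from V_0 = 0. Each step solves
    min_u max_w [q x^2 + r u^2 - gamma^2 w^2 + V_i(a x + b u + c w)]
    in closed form.
    """
    g = gamma ** 2
    p = 0.0
    history = [p]
    for _ in range(max_iter):
        # The inner maximization over w is concave only if g - c^2 p > 0.
        if g - c * c * p <= 0.0:
            raise ValueError("no saddle point: gamma is too small for this p")
        d = 1.0 + p * (b * b / r - c * c / g)
        p_next = q + a * a * p / d
        history.append(p_next)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, history


def saddle_point_gains(p, a, b, c, r, gamma):
    """Linear feedback gains u = k_u * x and w = k_w * x for V(x) = p * x**2."""
    g = gamma ** 2
    d = 1.0 + p * (b * b / r - c * c / g)
    k_u = -p * b * a / (r * d)
    k_w = p * c * a / (g * d)
    return k_u, k_w


# Hypothetical parameters: the open-loop system is unstable because |a| > 1.
A, B, C, Q, R, GAMMA = 1.1, 1.0, 0.5, 1.0, 1.0, 2.0

p_star, history = value_iteration(A, B, C, Q, R, GAMMA)
k_u, k_w = saddle_point_gains(p_star, A, B, C, R, GAMMA)

# Closed-loop dynamics under both policies: x_{k+1} = closed_loop * x_k.
closed_loop = A + B * k_u + C * k_w

# Lyapunov-style check: V(x_{k+1}) - V(x_k) = p_star * (closed_loop^2 - 1) * x^2.
lyapunov_decrease = p_star * (closed_loop ** 2 - 1.0)

print(f"p* = {p_star:.6f} after {len(history) - 1} iterations")
print(f"k_u = {k_u:.6f}, k_w = {k_w:.6f}, closed-loop pole = {closed_loop:.6f}")
```

In this sketch a negative `lyapunov_decrease` means that the converged value function decreases along closed-loop trajectories, which is the kind of stability condition a Lyapunov argument checks. For nonlinear systems no such closed form exists, which is where function approximators such as neural networks come in.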
