Zero-sum two-player game theoretic formulation of affine nonlinear discrete-time systems using neural networks

In this paper, the nearly optimal solution for discrete-time (DT) affine nonlinear control systems in the presence of partially unknown internal system dynamics and disturbances is considered. The approach is based on successive approximation of the solution to the Hamilton-Jacobi-Isaacs (HJI) equation, which arises in optimal control. A successive approximation approach for updating the control input and the disturbance for DT nonlinear affine systems is proposed. Moreover, sufficient conditions for the convergence of the approximate HJI solution to the saddle point are derived, and an iterative approach to approximating the HJI equation using a neural network (NN) is presented. Then, the requirement of full knowledge of the internal dynamics of the nonlinear DT system is relaxed by using a second NN as an online approximator. The result is a closed-loop optimal NN controller obtained via offline learning. A numerical example is provided to illustrate the effectiveness of the approach.
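For orientation, the problem class described above is often written as follows. This is a sketch of a common textbook formulation, not the paper's own notation: the symbols $f$, $g$, $h$, $Q$, $R$, $\gamma$, and the iteration index $i$ are assumptions, since the abstract states no equations.

```latex
% Assumed notation: state x_k, control u_k, disturbance w_k,
% internal dynamics f (the part treated as partially unknown),
% input maps g and h, attenuation level \gamma > 0.
\begin{align}
  x_{k+1} &= f(x_k) + g(x_k)\,u_k + h(x_k)\,w_k, \\
  r(x_k,u_k,w_k) &= Q(x_k) + u_k^{\top} R\,u_k - \gamma^{2}\, w_k^{\top} w_k, \\
  % Discrete-time HJI equation: u minimizes and w maximizes the cost-to-go.
  V^{*}(x_k) &= \min_{u_k}\max_{w_k}\bigl\{\, r(x_k,u_k,w_k) + V^{*}(x_{k+1}) \,\bigr\}.
\end{align}
% Successive approximation, iteration i: evaluate the current policy pair,
\begin{equation}
  V^{(i)}(x_k) = r\bigl(x_k,u^{(i)}(x_k),w^{(i)}(x_k)\bigr) + V^{(i)}(x_{k+1}),
\end{equation}
% then update both players from the stationarity conditions of the bracketed term:
\begin{align}
  u^{(i+1)}(x_k) &= -\tfrac{1}{2}\,R^{-1} g(x_k)^{\top}\,
      \frac{\partial V^{(i)}(x_{k+1})}{\partial x_{k+1}}, \\
  w^{(i+1)}(x_k) &= \frac{1}{2\gamma^{2}}\, h(x_k)^{\top}\,
      \frac{\partial V^{(i)}(x_{k+1})}{\partial x_{k+1}}.
\end{align}
```

In this sketch the saddle point is the policy pair at which the order of the minimization and maximization can be exchanged without changing the value. The two update laws are implicit, because $x_{k+1}$ itself depends on $u_k$ and $w_k$; how the paper resolves this and orders the two updates is not stated in the abstract.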
