Computationally efficient simultaneous policy update algorithm for nonlinear H∞ state feedback control with Galerkin's method

The main bottleneck in applying H∞ control theory to practical nonlinear systems is the need to solve the Hamilton–Jacobi–Isaacs (HJI) equation. The HJI equation is a nonlinear partial differential equation (PDE) that has proven impossible to solve analytically, and even approximate solutions are difficult to obtain. In this paper, we propose a simultaneous policy update algorithm (SPUA), in which the nonlinear HJI equation is solved by iteratively solving a sequence of Lyapunov function equations, which are linear PDEs. By constructing a fixed-point equation, we rigorously establish the convergence of the SPUA by proving that it is essentially Newton's iteration method for finding the fixed point. Subsequently, a computationally efficient SPUA (CESPUA) based on Galerkin's method is developed to solve the Lyapunov function equations at each iterative step of the SPUA. The CESPUA is simple to implement because it involves only one iterative loop. Simulation studies on three examples demonstrate that the proposed CESPUA is valid and efficient. Copyright © 2012 John Wiley & Sons, Ltd.
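
To illustrate the idea of replacing a nonlinear equation by a sequence of linear Lyapunov equations, the following minimal sketch considers a scalar linear special case, which is an assumption made here for illustration and is not one of the paper's examples. For the hypothetical plant x' = a x + b u + d w with penalty q x^2 + u^2 and attenuation level gamma, the value function V(x) = p x^2 turns the HJI equation into the scalar equation 2 a p + q - s p^2 = 0 with s = b^2 - d^2/gamma^2. Updating the control and disturbance policies simultaneously from the current p_i and solving the resulting (linear) Lyapunov equation for p_{i+1} then coincides with one Newton step on that equation. The function names, the parameter values, and the choice p0 = 0 (admissible here because the plant is assumed open-loop stable) are all illustrative assumptions; no Galerkin approximation is needed in this scalar setting.

```python
import math


def spua_scalar(a, b, d, gamma, q, p0=0.0, tol=1e-12, max_iter=50):
    """Simultaneous policy update for a scalar linear plant (illustrative only).

    Policies at step i: u_i = -b * p_i * x and w_i = (d / gamma**2) * p_i * x.
    The Lyapunov equation for the next value function V = p_next * x**2 is
        2 * p_next * (a - s * p_i) + q + s * p_i**2 = 0,
    which is linear in p_next.
    """
    s = b * b - (d * d) / (gamma * gamma)
    history = [p0]
    for _ in range(max_iter):
        p = history[-1]
        closed_loop = a - s * p  # closed-loop pole under (u_i, w_i)
        if closed_loop >= 0.0:
            raise ValueError("closed loop is not stable; choose another p0")
        # Solve the linear Lyapunov equation for the updated coefficient.
        p_next = -(q + s * p * p) / (2.0 * closed_loop)
        history.append(p_next)
        if abs(p_next - p) < tol:
            break
    return history[-1], history


def hji_residual(p, a, b, d, gamma, q):
    """Residual of the scalar HJI (game Riccati) equation at p."""
    s = b * b - (d * d) / (gamma * gamma)
    return 2.0 * a * p + q - s * p * p


if __name__ == "__main__":
    # Hypothetical parameters: stable plant, gamma = 2.
    p_star, hist = spua_scalar(a=-1.0, b=1.0, d=1.0, gamma=2.0, q=1.0)
    print("iterates:", hist)
    print("residual:", hji_residual(p_star, -1.0, 1.0, 1.0, 2.0, 1.0))
```

In this sketch the single loop both updates the two policies and solves the Lyapunov equation, which mirrors the one-loop structure described above; for a genuinely nonlinear system the closed-form Lyapunov solve would have to be replaced by an approximate PDE solver such as a Galerkin projection.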
