Adaptive Dynamic Programming for H∞ Control of Continuous-Time Unknown Nonlinear Systems via Generalized Fuzzy Hyperbolic Models

In this paper, a novel adaptive dynamic programming (ADP) algorithm is developed for the infinite-horizon $H_{\infty}$ optimal control problem of unknown continuous-time (CT) nonlinear systems subject to external disturbances. To facilitate implementation, generalized fuzzy hyperbolic models (GFHMs) are used to build an identifier–critic architecture: the identifier reconstructs the unknown system dynamics, and the GFHM-based critic network approximates the value function. The CT $H_{\infty}$ optimal control problem is converted into a two-player zero-sum game, and the corresponding Hamilton–Jacobi–Isaacs equation is derived. The critic is trained adaptively using the reconstructed model, so the requirement for complete knowledge of the system dynamics is relaxed. Furthermore, by means of the Lyapunov direct method, the closed-loop control system is shown to be uniformly ultimately bounded. Finally, two numerical examples compare the control performance and disturbance attenuation of the proposed method with those of existing ADP algorithms.
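
For orientation, the zero-sum game conversion described in the abstract is commonly written as follows. This is a sketch of the standard textbook formulation, assuming input-affine dynamics; the symbols $f$, $g$, $k$, $Q$, $R$, and $\gamma$ are illustrative assumptions and are not taken from the paper itself.

$$
\dot{x} = f(x) + g(x)\,u + k(x)\,w, \qquad
J(x_0, u, w) = \int_0^{\infty} \left( Q(x) + u^{\top} R\, u - \gamma^{2} w^{\top} w \right) dt
$$

Here the control $u$ acts as the minimizing player and the disturbance $w$ as the maximizing player, with $Q(x) \ge 0$, $R \succ 0$, and $\gamma > 0$ the prescribed attenuation level. Writing $V^{*}(x) = \min_{u} \max_{w} J$ and assuming $V^{*}$ is smooth, the saddle-point policies are

$$
u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, \nabla V^{*}(x), \qquad
w^{*} = \tfrac{1}{2\gamma^{2}}\, k^{\top}(x)\, \nabla V^{*}(x),
$$

and substituting them into the stationarity condition gives the Hamilton–Jacobi–Isaacs equation in this setting:

$$
0 = Q(x) + \nabla V^{*\top} f(x)
  - \tfrac{1}{4}\, \nabla V^{*\top} g(x) R^{-1} g^{\top}(x)\, \nabla V^{*}
  + \tfrac{1}{4\gamma^{2}}\, \nabla V^{*\top} k(x) k^{\top}(x)\, \nabla V^{*}, \qquad V^{*}(0) = 0.
$$

In an identifier–critic scheme of the kind the abstract outlines, $f$, $g$, and $k$ would be replaced by their identified estimates and $V^{*}$ by the critic's approximation.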
