Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems

This paper proposes an off-policy integral reinforcement learning (IRL) algorithm to obtain the optimal tracking control of unknown continuous-time chaotic systems. Off-policy IRL learns the solution of the Hamilton-Jacobi-Bellman (HJB) equation from system data generated by an arbitrary control. It can also be regarded as a direct learning method, because it avoids identifying the system dynamics. The performance index function is first defined in terms of the system tracking error and the control error. An off-policy IRL algorithm is then developed to solve the resulting HJB equation. It is proven that the iterative control makes the tracking error system asymptotically stable and that the iterative performance index function converges. A simulation study demonstrates the effectiveness of the developed tracking control method.
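
As a rough illustration of the setup summarized above, the following sketch writes out one common form of the performance index and of the off-policy IRL relation. It is not taken from the paper: the affine error dynamics $\dot e = f(e) + g(e)u_e$, the quadratic weights $Q$ and $R$, the reference quantities $x_d$ and $u_d$, the behavior control $u$, and the interval length $T$ are all assumptions chosen for illustration, and the paper's own formulation may differ.

```latex
% Assumed notation (not from the source): tracking error e = x - x_d,
% control error u_e = u - u_d, weights Q > 0 and R > 0.
\begin{align}
  % Performance index built from the tracking error and the control error
  J(e(t)) &= \int_t^{\infty} \bigl( e^{\top} Q e + u_e^{\top} R u_e \bigr)\, d\tau \\
  % HJB equation for the assumed error dynamics \dot e = f(e) + g(e) u_e
  0 &= \min_{u_e} \Bigl[ e^{\top} Q e + u_e^{\top} R u_e
       + \nabla J^{*}(e)^{\top} \bigl( f(e) + g(e) u_e \bigr) \Bigr] \\
  % Policy improvement used at iteration i
  u^{(i+1)}(e) &= -\tfrac{1}{2} R^{-1} g(e)^{\top} \nabla V^{(i)}(e) \\
  % Off-policy IRL relation over [t, t+T]; u is an arbitrary behavior control,
  % and neither f nor g appears explicitly
  V^{(i)}(e(t+T)) - V^{(i)}(e(t)) &=
      -\int_t^{t+T} \bigl( e^{\top} Q e + u^{(i)\top} R u^{(i)} \bigr)\, d\tau
      + 2 \int_t^{t+T} u^{(i+1)\top} R \bigl( u^{(i)} - u \bigr)\, d\tau
\end{align}
```

In this sketch the last relation follows from differentiating $V^{(i)}$ along trajectories driven by $u$ and substituting $g^{\top}\nabla V^{(i)} = -2R\,u^{(i+1)}$. Because it contains only measured trajectories and the two unknowns $V^{(i)}$ and $u^{(i+1)}$, both can in principle be fitted from data generated by an arbitrary control without a model of the dynamics.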
