Online reinforcement learning control of unknown nonaffine nonlinear discrete time systems

In this paper, a novel neural network (NN) based online reinforcement learning controller is designed for nonaffine nonlinear discrete-time systems with bounded disturbances. The nonaffine systems are represented by a nonlinear autoregressive moving average with exogenous input (NARMAX) model with unknown nonlinear functions. An equivalent affine-like representation of the tracking error dynamics is first developed from the original nonaffine system. Subsequently, a reinforcement learning-based NN controller is proposed for the affine-like nonlinear error dynamics. The control scheme consists of two NNs: a critic NN, which approximates a predefined long-term cost function, and an action NN, which generates a control signal that drives the system to track a desired trajectory while minimizing the cost function. Offline NN training is not required; online NN weight tuning rules are derived instead. Using the standard Lyapunov approach, the uniform ultimate boundedness (UUB) of the tracking error and the weight estimates is demonstrated.
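
The two-network structure described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the scalar plant `plant`, the sinusoidal desired trajectory, the clipped quadratic utility, the fixed random hidden layers, the gains `alpha_a`, `alpha_c`, `gamma`, `k_c`, and both update rules. The critic uses a standard normalized semi-gradient TD(0) step, and the action NN uses a heuristic normalized step driven by the tracking error plus a small multiple of the critic estimate. These stand in for the tuning rules derived in the paper, which the abstract does not state, and the sketch carries no stability guarantee.

```python
import numpy as np

def plant(y, u):
    # Hypothetical nonaffine plant: the input u enters nonlinearly (assumption).
    return 0.5 * np.sin(y) + u + 0.1 * np.tanh(u)

def simulate(steps=400, seed=0, alpha_a=0.05, alpha_c=0.05, gamma=0.9, k_c=0.1):
    rng = np.random.default_rng(seed)
    n_hidden = 8
    V_a = rng.normal(size=(n_hidden, 2))  # fixed hidden weights, action NN
    V_c = rng.normal(size=(n_hidden, 1))  # fixed hidden weights, critic NN
    w_a = np.zeros(n_hidden)              # action output weights, tuned online
    w_c = np.zeros(n_hidden)              # critic output weights, tuned online

    y, y_d = 0.0, 0.0
    errors = np.empty(steps)
    for k in range(steps):
        y_d_next = np.sin(0.05 * (k + 1))   # assumed desired trajectory
        e = y - y_d                          # current tracking error

        # Action NN: control from the current output and the next target.
        phi_a = np.tanh(V_a @ np.array([y, y_d_next]))
        u = w_a @ phi_a

        y_next = plant(y, u)
        e_next = y_next - y_d_next
        r = min(e_next ** 2, 1.0)            # assumed clipped quadratic utility

        # Critic NN: estimate of the long-term cost as a function of the error.
        phi_c = np.tanh(V_c @ np.array([e]))
        phi_c_next = np.tanh(V_c @ np.array([e_next]))
        J, J_next = w_c @ phi_c, w_c @ phi_c_next

        # Normalized semi-gradient TD(0) step for the critic (assumption).
        delta = r + gamma * J_next - J
        w_c = w_c + alpha_c * delta * phi_c / (1.0 + phi_c @ phi_c)

        # Heuristic normalized step for the action NN (assumption): reduce the
        # tracking error plus a small multiple of the estimated cost.
        w_a = w_a - alpha_a * (e_next + k_c * J_next) * phi_a / (1.0 + phi_a @ phi_a)

        errors[k] = e_next
        y, y_d = y_next, y_d_next
    return errors, w_a, w_c

if __name__ == "__main__":
    errors, w_a, w_c = simulate()
    print("mean |e| over first 50 steps:", np.abs(errors[:50]).mean())
    print("mean |e| over last 50 steps: ", np.abs(errors[-50:]).mean())
```

No offline training phase appears in the sketch: both weight vectors start at zero and are adjusted at every time step from measured data only, which is the sense in which the scheme is online.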
