Finite-horizon near-optimal adaptive control of uncertain linear discrete-time systems

SUMMARY In this paper, the finite-horizon near-optimal adaptive regulation of linear discrete-time systems with unknown system dynamics is presented in a forward-in-time manner by using adaptive dynamic programming and Q-learning. An adaptive estimator (AE) is introduced to relax the requirement of known system dynamics, and it is tuned by using Q-learning. The time-varying solution to the Bellman equation that arises in finite-horizon adaptive dynamic programming is handled by utilizing a time-dependent basis function, while the terminal constraint is incorporated into the update law of the AE. The Kalman gain is obtained from the AE parameters, and the control input is computed by using the AE and the system state vector. Next, to relax the need for full state availability, an adaptive observer is proposed so that the linear quadratic regulator design uses the reconstructed states and outputs. Although the linear discrete-time system is time invariant, the finite-horizon formulation renders the closed-loop dynamics non-autonomous and more involved; stability is nonetheless verified by using standard Lyapunov and geometric-sequence arguments. The effectiveness of the proposed approach is demonstrated through simulation results. The proposed linear quadratic regulator design for the uncertain linear system requires an initial admissible control input and yields a forward-in-time, online solution without needing value and/or policy iterations. Copyright © 2014 John Wiley & Sons, Ltd.
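The core idea summarized above — recovering the finite-horizon Kalman (feedback) gain from measured data via a Q-function, rather than from known system matrices — can be illustrated with a minimal sketch. The snippet below is a hypothetical simplification, not the paper's AE update: for a single stage of a finite-horizon LQR problem, it fits the quadratic Q-kernel `H` to sampled transitions by least squares (the fit itself never touches `A` or `B`), then reads the gain off the estimated kernel and compares it with the model-based Riccati step. The system matrices, sample counts, and cost weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative second-order system; the learner only sees sampled
# transitions, never A or B directly.
A = np.array([[1.0, 0.1],
              [0.0, 0.9]])
B = np.array([[0.0],
              [0.1]])
Qc, R = np.eye(2), np.array([[1.0]])

# Model-based baseline for one stage of the backward Riccati recursion:
# P_next plays the role of the cost-to-go weight at stage k+1
# (here simply the terminal weight, as at the last stage).
P_next = np.eye(2)
K_model = np.linalg.solve(R + B.T @ P_next @ B, B.T @ P_next @ A)

# Data-driven alternative: fit the stage-k Q-kernel H from transitions,
# where q = x'Qc x + u'R u + x_next' P_next x_next and z = [x; u],
# so that q = z' H z with H symmetric 3x3 (6 unique entries).
Z, q = [], []
for _ in range(50):
    x = rng.standard_normal((2, 1))
    u = rng.standard_normal((1, 1))
    x_next = A @ x + B @ u              # sampled from the system itself
    z = np.vstack([x, u]).ravel()
    Z.append([z[0]**2, z[1]**2, z[2]**2,
              2 * z[0] * z[1], 2 * z[0] * z[2], 2 * z[1] * z[2]])
    q.append(float(x.T @ Qc @ x + u.T @ R @ u
                   + x_next.T @ P_next @ x_next))
theta, *_ = np.linalg.lstsq(np.array(Z), np.array(q), rcond=None)

# Reassemble the symmetric kernel and read off the gain K = H_uu^{-1} H_ux.
H = np.array([[theta[0], theta[3], theta[4]],
              [theta[3], theta[1], theta[5]],
              [theta[4], theta[5], theta[2]]])
K_learned = np.linalg.solve(H[2:, 2:], H[2:, :2])

print(np.allclose(K_learned, K_model, atol=1e-6))
```

Since the sampled costs here are noise-free and quadratic in `z`, the least-squares fit is exact and the learned gain matches the Riccati gain; in the paper's setting the kernel instead evolves with the stage index and is tracked online through the time-dependent basis function and the AE update law.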
