Policy-iteration-based adaptive optimal control for uncertain continuous-time linear systems with excitation signals

This paper proposes a novel policy-iteration-based adaptive optimal control scheme for uncertain continuous-time linear systems with excitation signals. The proposed method solves the associated linear quadratic optimal control problem online, exactly and safely. To maintain the persistence of excitation condition, the controller injects small excitation signals into the system. For the resulting linear system with excitation signals, the policy iteration (PI) technique is investigated to adaptively find the optimal control law in the presence of both internal uncertainties and known excitation signals. For the proposed PI technique, closed-loop stability and convergence to the optimal solution are mathematically proven. Numerical simulations verify the effectiveness of the proposed method.
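
For orientation, the policy iteration idea referred to above can be illustrated with the classical model-based iteration for the continuous-time linear quadratic problem: evaluate the current gain by solving a Lyapunov equation, then improve the gain from the resulting cost matrix. The sketch below is not the adaptive scheme proposed in the paper. It assumes the system matrices A and B are fully known and uses no excitation signal, whereas the paper addresses uncertain dynamics. The function names and the example matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P using Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # With column-major vec: vec(Ac.T P) = (I kron Ac.T) vec(P),
    # and vec(P Ac) = (Ac.T kron I) vec(P).
    lhs = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(lhs, -M.flatten(order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration_lqr(A, B, Q, R, K0, iterations=20):
    """Model-based policy iteration; K0 must stabilize A - B @ K0."""
    K = K0
    for _ in range(iterations):
        Ac = A - B @ K
        # Policy evaluation: cost matrix of the current gain K.
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)
        # Policy improvement: K = R^{-1} B^T P.
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Hypothetical example system (illustrative values only).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])  # already stable, so K0 = 0 is admissible
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration_lqr(A, B, Q, R, K0)

# Residual of the algebraic Riccati equation at the final iterate.
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("max |ARE residual| =", np.abs(residual).max())
```

The initial gain must be stabilizing for each Lyapunov equation to have a unique positive semidefinite solution. The example picks a stable A so that a zero initial gain qualifies.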
