Online policy iterations for optimal control of input-saturated systems

This work proposes an online policy iteration procedure for the synthesis of sub-optimal control laws for uncertain Linear Time-Invariant (LTI) Asymptotically Null-Controllable with Bounded Inputs (ANCBI) systems. The proposed method relies on two steps: a policy evaluation step that uses a Lyapunov function that is piecewise quadratic in both the state and the deadzone functions of the input signals, and a policy improvement step that simultaneously guarantees near-optimality (exploitation) and persistence of excitation (exploration). The approach guarantees convergence of the trajectory to a neighborhood of the origin. Moreover, the trajectories can be made arbitrarily close to the optimal one provided that the value function and the control policy are updated at a sufficiently fast rate. The inequalities that must hold at each policy evaluation step can be solved efficiently with semidefinite programming (SDP) solvers. A numerical example illustrates the results.
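To make the phrase "piecewise quadratic in both the state and the deadzone functions of the input signals" concrete, the following is a minimal sketch and not the paper's implementation. It assumes a symmetric saturation with limit `u_max`, the usual deadzone definition dz(u) = u - sat(u), a linear feedback u = Kx, and a function of the form V(x) = zᵀPz with z = [x; dz(Kx)]. The names `sat`, `dz`, `piecewise_quadratic`, and the numerical values of `K` and `P` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def sat(u: Sequence[float], u_max: float = 1.0) -> List[float]:
    """Componentwise symmetric saturation with limit u_max (assumed form)."""
    return [max(-u_max, min(u_max, ui)) for ui in u]


def dz(u: Sequence[float], u_max: float = 1.0) -> List[float]:
    """Deadzone dz(u) = u - sat(u); zero whenever the input is unsaturated."""
    return [ui - si for ui, si in zip(u, sat(u, u_max))]


def mat_vec(M: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Plain matrix-vector product, to keep the sketch dependency-free."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def piecewise_quadratic(
    x: Sequence[float],
    K: Sequence[Sequence[float]],
    P: Sequence[Sequence[float]],
    u_max: float = 1.0,
) -> float:
    """Evaluate V(x) = z^T P z with the extended vector z = [x; dz(Kx)]."""
    z = list(x) + dz(mat_vec(K, x), u_max)
    Pz = mat_vec(P, z)
    return sum(zi * pzi for zi, pzi in zip(z, Pz))


# Hypothetical 2-state, 1-input data (illustrative values, not from the paper).
K = [[-2.0, -1.0]]
P = [
    [2.0, 0.5, 0.1],
    [0.5, 1.0, 0.0],
    [0.1, 0.0, 0.3],
]

v_unsaturated = piecewise_quadratic([0.1, 0.1], K, P)  # Kx = -0.3, dz = 0
v_saturated = piecewise_quadratic([1.0, 1.0], K, P)    # Kx = -3.0, dz = -2.0
```

Where the input is unsaturated the deadzone vanishes and V reduces to an ordinary quadratic form in x. Once the input saturates, the extra terms in dz(Kx) become active, which is what makes the function piecewise quadratic in x. In the policy evaluation step described above, P would be obtained from an SDP rather than fixed by hand as it is here.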
