H∞ Control of Nonaffine Aerial Systems Using Off-policy Reinforcement Learning

This paper presents a completely model-free H∞ optimal tracking solution for a general class of nonlinear nonaffine systems in the presence of input constraints. The proposed method is motivated by the nonaffine unmanned aerial vehicle (UAV) system as a real-world application. First, the dynamics of a general class of nonlinear nonaffine systems are rewritten as an affine system in terms of a nonlinear function of the control input. It is shown that the optimal control problem for nonaffine systems may not admit an admissible solution if the utility function is not defined properly; moreover, for standard performance functions, boundedness of the optimal control input cannot be guaranteed. A new performance function is therefore defined and used in the L2-gain condition for this class of nonaffine systems. This performance function guarantees admissibility of the solution (if one exists) and boundedness of the resulting control input. An off-policy reinforcement learning (RL) algorithm is employed to iteratively solve the H∞ optimal tracking control problem online, using data measured along the system trajectories. The proposed off-policy RL algorithm requires no knowledge of the system dynamics, and the disturbance input does not need to be adjusted in any specific manner.
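To make the iterative H∞ policy-improvement idea concrete, the sketch below applies simultaneous updates of the control and disturbance policies to a linear state-feedback problem, where the scheme reduces to iterating a Lyapunov equation toward the game algebraic Riccati equation. This is an illustrative, model-based analogue only, not the paper's model-free nonaffine algorithm; the plant matrices, weights, and the gain bound `gamma` are arbitrary assumptions chosen so a solution exists.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical stable linear plant  dx = A x + B u + D w  (illustrative only)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])   # control input matrix
D = np.array([[1.0], [0.0]])   # disturbance input matrix
Q = np.eye(2)                  # state weight
R = np.eye(1)                  # control weight
gamma = 5.0                    # assumed L2-gain bound (above the optimal gain)

K = np.zeros((1, 2))  # initial stabilizing control gain (A itself is stable)
L = np.zeros((1, 2))  # initial disturbance gain
for _ in range(50):
    # Closed loop under the current control and worst-case-disturbance policies
    Ac = A - B @ K + D @ L
    M = Q + K.T @ R @ K - gamma**2 * L.T @ L
    # Policy evaluation: solve  Ac' P + P Ac + M = 0
    P = solve_continuous_lyapunov(Ac.T, -M)
    # Simultaneous policy improvement for both players
    K = np.linalg.solve(R, B.T @ P)        # improved control policy
    L = (1.0 / gamma**2) * (D.T @ P)       # worst-case disturbance policy

# At convergence P satisfies the game algebraic Riccati equation:
# A'P + PA + Q - P B R^{-1} B' P + gamma^{-2} P D D' P = 0
residual = (A.T @ P + P @ A + Q
            - P @ B @ np.linalg.solve(R, B.T) @ P
            + (1.0 / gamma**2) * P @ D @ D.T @ P)
print(np.linalg.norm(residual))
```

The paper's off-policy RL method removes the need for `A`, `B`, and `D` by evaluating and improving the policies from measured trajectory data instead of solving the Lyapunov equation with known model matrices.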
