Unmanned aerial vehicles (UAV) heading optimal tracking control using online kernel-based HDP algorithm

Unmanned aerial vehicles (UAVs) can work in places that are dangerous or hard for humans to reach. However, because active control and operation are difficult, achieving fully autonomous flight in complex environments remains a challenge. This paper applies kernel-based heuristic dynamic programming (KHDP), a novel variant of heuristic dynamic programming (HDP), to the design of an optimal heading tracking controller for UAVs. Kernel-based HDP is developed by integrating kernel methods and approximately linear dependence (ALD) analysis into the critic learning of the HDP algorithm. Conventional HDP typically relies on neural networks whose features are designed manually. By contrast, the proposed algorithm applies a sparse kernel machine in the critic learning process, which yields better generalization capability and learning efficiency. Simulation and experimental results on UAV heading optimal tracking control problems demonstrate the effectiveness of the proposed kernel-based HDP algorithm.
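To make the ALD analysis mentioned in the abstract more concrete, the following is a minimal sketch of ALD-based sparsification as it is commonly described for kernel methods. It is not the paper's implementation: the Gaussian kernel, its width, the threshold value, and the names `gaussian_kernel` and `ald_dictionary` are all assumptions chosen for illustration. A sample joins the kernel dictionary only when its feature vector cannot be approximated, within the threshold, by a linear combination of the feature vectors already kept.

```python
import numpy as np


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel. The kernel type and width are illustrative assumptions."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * width ** 2)))


def ald_dictionary(samples, threshold=0.1, width=1.0):
    """Greedy ALD sparsification (hypothetical helper, not from the paper).

    Keeps a sample only if the squared residual of projecting its kernel
    feature onto the span of the kept features exceeds `threshold`.
    """
    dictionary = []
    for x in samples:
        if not dictionary:
            dictionary.append(x)
            continue
        n = len(dictionary)
        # Kernel matrix of the current dictionary and kernel vector of the new sample.
        K = np.array([[gaussian_kernel(a, b, width) for b in dictionary]
                      for a in dictionary])
        k = np.array([gaussian_kernel(a, x, width) for a in dictionary])
        # Least-squares coefficients; a small jitter keeps the solve well conditioned.
        coeffs = np.linalg.solve(K + 1e-10 * np.eye(n), k)
        # ALD test: residual of the best linear approximation in feature space.
        delta = gaussian_kernel(x, x, width) - float(k @ coeffs)
        if delta > threshold:
            dictionary.append(x)
    return dictionary


# Example: duplicated or near-duplicated states are dropped, distant ones are kept.
states = [[0.0], [0.0], [5.0], [0.01]]
kept = ald_dictionary(states, threshold=0.1)
print(len(kept))
```

In a kernel-based critic, the value function would then be represented only over the kept dictionary elements, so the threshold trades approximation accuracy against the size of the model.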
