Tracking control of discrete-time affine nonlinear systems based on kernel-HDP algorithm

In the past decade, adaptive dynamic programming (ADP) has been widely used for online learning tracking control of dynamical systems, typically with neural networks that rely on manually designed features. To improve the generalization capability and learning efficiency of ADP, this paper presents a novel framework of ADP with sparse kernel machines, which integrates kernel methods and approximately linear dependence (ALD) analysis into the critic module of ADP for optimal tracking controller design. The resulting algorithm, kernel HDP (KHDP), combines sparse kernel learning with heuristic dynamic programming (HDP). A simulation experiment based on KHDP demonstrates the effectiveness of the proposed algorithm.
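The ALD analysis mentioned above can be illustrated with a minimal sketch. It follows the generic ALD test from the kernel recursive least-squares literature, not necessarily the exact procedure used in KHDP: a sample joins the kernel dictionary only if its feature-space image cannot be approximated, within a threshold, by a linear combination of the current dictionary elements. The Gaussian kernel, the names `gaussian_kernel`, `solve`, and `ald_dictionary`, and the parameters `mu` and `sigma` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(M[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (M[i][n] - rest) / M[i][i]
    return coeffs


def ald_dictionary(samples, mu=0.1, sigma=1.0):
    """Build a sparse dictionary with the ALD test.

    A sample x is added when
        delta = k(x, x) - k_vec^T K^{-1} k_vec > mu,
    where K is the kernel matrix of the current dictionary and
    k_vec holds the kernel values between the dictionary and x.
    mu is a hypothetical threshold chosen for illustration.
    """
    dictionary = []
    for x in samples:
        if not dictionary:
            dictionary.append(x)
            continue
        K = [[gaussian_kernel(a, b, sigma) for b in dictionary]
             for a in dictionary]
        k_vec = [gaussian_kernel(a, x, sigma) for a in dictionary]
        coeffs = solve(K, k_vec)
        delta = gaussian_kernel(x, x, sigma) - sum(
            c * k for c, k in zip(coeffs, k_vec))
        if delta > mu:
            dictionary.append(x)
    return dictionary


# Two tight clusters of 1-D states: only one representative of each survives.
states = [(0.0,), (0.01,), (5.0,), (5.01,), (0.0,)]
print(ald_dictionary(states, mu=0.1))  # → [(0.0,), (5.0,)]
```

A larger `mu` yields a smaller dictionary and a cheaper but coarser approximation; a smaller `mu` keeps more samples. This sketch rebuilds and solves the kernel system at every step for readability, whereas a practical implementation would update the inverse kernel matrix recursively.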
