Online adaptation of controller parameters based on approximate dynamic programming

Controller parameter tuning is an integral part of control engineering practice. Existing tuning methods usually start from an accurate mathematical model of the controlled system, a requirement that can be difficult to meet for practicing engineers dealing with real systems. Moreover, parameter optimization and adaptation are typically treated as two independent steps during tuning. To address these issues, we propose a new online tuning method for parameterized controllers of a general nonlinear dynamic system. The method is based on direct heuristic dynamic programming (direct HDP), a model-free algorithm in the approximate dynamic programming (ADP) family. Using a Lyapunov stability approach, we show that, under some mild conditions, the controller parameters, the critic neural network weights, and the action neural network weights are uniformly ultimately bounded (UUB). Simulation studies on the benchmark cart-pole system demonstrate the adaptation and optimization capabilities of the proposed controller parameter tuning method.
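
For orientation, the learning signals of direct HDP in its standard formulation can be sketched as follows. This is a reminder of the generic algorithm under stated assumptions, not the derivation of this paper: the symbols $J$ (critic output), $r$ (reinforcement signal), $\alpha$ (discount factor), $U_c$ (desired ultimate objective), $w_c$, $w_a$ (critic and action network weights), and $l_c$, $l_a$ (learning rates) are notation assumed here and do not appear in the abstract. How the controller parameters enter the action network is likewise not specified in the abstract.

```latex
% Critic network: J(t) approximates the discounted cost-to-go,
% trained on the temporal-difference-style prediction error e_c.
e_c(t) = \alpha J(t) - \bigl[ J(t-1) - r(t) \bigr], \qquad
E_c(t) = \tfrac{1}{2} e_c^2(t)

% Action network: trained to drive J(t) toward the desired objective U_c.
e_a(t) = J(t) - U_c(t), \qquad
E_a(t) = \tfrac{1}{2} e_a^2(t)

% Both networks are adapted online by gradient descent (assumed notation).
w_c(t+1) = w_c(t) - l_c \, \frac{\partial E_c(t)}{\partial w_c(t)}, \qquad
w_a(t+1) = w_a(t) - l_a \, \frac{\partial E_a(t)}{\partial w_a(t)}
```

Because both updates use only measured signals ($r$, $J$) and not a plant model, a scheme of this kind can optimize and adapt in a single online loop, which is the property the abstract emphasizes.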
