Incremental model-based global dual heuristic programming with explicit analytical calculations applied to flight control

A novel adaptive dynamic programming method, called incremental model-based global dual heuristic programming, is proposed to generate a self-learning adaptive flight controller in the absence of sufficient prior knowledge of the system dynamics. An incremental technique is employed for online local dynamics identification, instead of the artificial neural networks commonly used in global dual heuristic programming, to enable fast and precise learning. On the basis of the identified model, two neural networks are adopted to facilitate the implementation of the self-learning controller by approximating the cost-to-go and the control policy, respectively. The required derivatives of the cost-to-go are computed by explicit analytical calculations based on differential operations. Both the proposed method and conventional global dual heuristic programming are applied to an online attitude tracking control problem of a nonlinear aerospace system, and the results show that the proposed method outperforms conventional global dual heuristic programming in tracking precision, online learning speed, robustness to different initial states, and adaptability for fault-tolerant control problems.
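
To make the idea of online local dynamics identification concrete, the following is a minimal sketch and not the authors' implementation. It assumes the incremental model takes the common form Δx(t+1) ≈ F·Δx(t) + G·Δu(t) and that F and G are estimated by recursive least squares with a forgetting factor; the abstract does not specify either choice. The scalar plant, the step size, the excitation signal, and all function names are hypothetical and serve only to generate data.

```python
import math


def rls_update(theta, P, phi, y, forgetting):
    """One recursive least squares step for a scalar output y ~ phi . theta.

    theta and phi are length-2 lists; P is a symmetric 2x2 nested list.
    """
    # P * phi (also equals (phi^T P)^T because P is symmetric)
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = forgetting + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    gain = [Pphi[0] / denom, Pphi[1] / denom]
    # Prediction error of the current incremental model
    err = y - (phi[0] * theta[0] + phi[1] * theta[1])
    theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
    # Covariance update: P <- (P - gain * phi^T P) / forgetting
    P = [[(P[i][j] - gain[i] * Pphi[j]) / forgetting for j in range(2)]
         for i in range(2)]
    return theta, P


def plant(x, u, dt=0.01):
    """Hypothetical scalar nonlinear plant, used only to generate data."""
    return x + dt * (-math.sin(x) + 2.0 * u)


def identify_incremental_model(steps=400, forgetting=0.99):
    """Estimate F and G in dx[t+1] ~ F*dx[t] + G*du[t] from online data."""
    theta = [0.0, 0.0]                 # [F, G] estimates
    P = [[1e6, 0.0], [0.0, 1e6]]       # large initial covariance
    x_prev, u_prev = 0.0, 0.0
    x = plant(x_prev, u_prev)
    for k in range(1, steps):
        u = math.sin(0.05 * k)         # assumed excitation input
        x_next = plant(x, u)
        phi = [x - x_prev, u - u_prev]  # state and input increments
        y = x_next - x                  # next state increment
        theta, P = rls_update(theta, P, phi, y, forgetting)
        x_prev, u_prev, x = x, u, x_next
    return theta[0], theta[1]


F_est, G_est = identify_incremental_model()
print("F_est =", round(F_est, 4), "G_est =", round(G_est, 4))
```

For this assumed plant the local derivatives are F = 1 - dt·cos(x) and G = 2·dt, so the estimates should land near 0.99 and 0.02. Only the state and input increments are needed, which is why no global model or network training is required for the identification step.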
