Online policy iteration ADP-based attitude-tracking control for hypersonic vehicles

Abstract: An online adaptive dynamic programming (ADP) attitude-tracking controller based on policy iteration (PIADP) is proposed to approximate the optimal control of hypersonic vehicles (HVs). The controller is derived from the Bellman equation, the fundamental recursion of dynamic programming, and generates the control action that tracks the commanded attitude trajectory. To approach optimal control for the uncertain nonlinear HV system, policy iteration is used to approximate the solution of the Bellman equation within an actor-predictor-critic framework, in which an action network, a state estimator and a critic network implement the iteration. In addition, an offline learning method supplies the initial values for the iterative computation and improves the efficiency of online learning. Comparative simulations demonstrate the good performance of PIADP under aerodynamic parameter perturbations and random disturbances.
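The alternation between policy evaluation (the critic's role) and policy improvement (the actor's role) can be illustrated with a minimal sketch on a toy problem. The example below is not the paper's controller: it assumes a known discrete-time linear model and a quadratic cost, so the Bellman equation for a fixed policy reduces to a linear (Lyapunov) equation that can be solved exactly instead of being approximated by neural networks. The function names, matrices and iteration count are all hypothetical choices made for illustration.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation: solve P = Qk + Ac^T P Ac for the fixed policy u = -K x."""
    Ac = A - B @ K                      # closed-loop dynamics under the policy
    Qk = Q + K.T @ R @ K                # stage cost matrix under the policy
    n = A.shape[0]
    # Row-major vectorisation: vec(Ac^T P Ac) = kron(Ac^T, Ac^T) vec(P)
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    P = np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)              # symmetrise against round-off

def improve_policy(A, B, R, P):
    """Policy improvement: gain that minimises the one-step Bellman right-hand side."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Alternate evaluation and improvement, starting from a stabilising gain K0."""
    K = K0
    for _ in range(iters):
        P = evaluate_policy(A, B, Q, R, K)
        K = improve_policy(A, B, R, P)
    return P, K

def riccati_residual(A, B, Q, R, P):
    """Largest entry of the discrete-time Riccati equation residual for P."""
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return np.max(np.abs(Q + A.T @ P @ A - A.T @ P @ B @ G - P))
```

In this linear-quadratic stand-in the initial gain `K0` must be stabilising, which loosely mirrors the abstract's use of offline learning to supply a good starting point for the online iteration. For a nonlinear, uncertain system such as an HV, the exact solve in `evaluate_policy` is not available, which is where function approximators such as a critic network and a state estimator come in.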
