Online Adaptive Incremental Reinforcement Learning Flight Control for a CS-25 Class Aircraft

In recent years, Adaptive Critic Designs (ACDs) have been applied to adaptive flight control of uncertain, nonlinear systems. However, these algorithms often rely on representative models because they require an offline training stage. They therefore have limited applicability to systems for which an accurate model is neither available nor readily identifiable. Inspired by recent work on Incremental Dual Heuristic Programming (IDHP), this paper derives and analyzes a Reinforcement Learning (RL) based framework for adaptive flight control of a CS-25 class fixed-wing aircraft. The proposed framework uses Artificial Neural Networks (ANNs) and includes an additional network structure to improve learning stability. The resulting learning controller is applied to a high-fidelity, six-degree-of-freedom simulation of the Cessna 550 Citation II PH-LAB research aircraft. The results demonstrate that the proposed framework can learn a near-optimal control policy online, without a priori knowledge of the system dynamics and without an offline training phase. Furthermore, it can generalize to and operate the aircraft in previously unencountered flight regimes, and it can identify and adapt to unforeseen changes in the aircraft's dynamics.
