Improved Adaptive–Reinforcement Learning Control for Morphing Unmanned Air Vehicles

This paper presents an improved adaptive-reinforcement learning control methodology for morphing control of unmanned air vehicles. A reinforcement learning morphing control function, which learns the optimal shape change policy, is integrated with an adaptive dynamic inversion control function for trajectory tracking. An episodic unsupervised learning simulation using the Q-learning method is developed, replacing an earlier and less accurate actor-critic algorithm. Sequential function approximation, a Galerkin-based scattered data approximation scheme, replaces a K-nearest neighbors (KNN) method and is used to generalize the learning from previously experienced quantized states and actions to the continuous state-action space, which includes state-action pairs that may never have been experienced. Compared with the KNN method, the improved method shows smaller errors and better learning of the optimal shape.
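As a rough illustration of the episodic Q-learning step mentioned in the abstract, the sketch below trains a tabular Q-function on a toy one-dimensional shape problem. It is not the paper's simulation: the quantized shape indices, the three actions (shrink, hold, grow), the reward (negative distance to a target shape), and all names and parameter values (`train_q_table`, `greedy_action`, `n_shapes`, `target`, and so on) are assumptions made for this example. The adaptive dynamic inversion controller and the sequential function approximation step are left out.

```python
import random


def train_q_table(n_shapes=7, target=4, episodes=500, steps=20,
                  alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy quantized-shape problem (hypothetical setup)."""
    rng = random.Random(seed)
    actions = (-1, 0, 1)  # shrink, hold, grow by one quantized shape step
    q = [[0.0] * len(actions) for _ in range(n_shapes)]

    for _ in range(episodes):
        s = rng.randrange(n_shapes)  # each episode starts from a random shape
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(actions))
            else:
                a = max(range(len(actions)), key=lambda i: q[s][i])

            # Deterministic toy dynamics: move one step, clamped to valid shapes.
            s_next = min(max(s + actions[a], 0), n_shapes - 1)
            # Assumed reward: negative distance to the target shape.
            reward = -abs(s_next - target)

            # Q-learning update (off-policy temporal-difference target).
            td_target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (td_target - q[s][a])
            s = s_next

    return q, actions


def greedy_action(q, actions, s):
    """Return the action with the highest learned Q-value in state s."""
    return actions[max(range(len(actions)), key=lambda i: q[s][i])]


if __name__ == "__main__":
    q, actions = train_q_table()
    for s in range(len(q)):
        print(s, greedy_action(q, actions, s))
```

In this toy setting the learned greedy policy should step toward the target shape from either side and hold once it is reached. A table of this kind only covers the quantized states it was trained on, which is the gap the abstract's function approximation step addresses for the continuous state-action space.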
