A Reinforcement Learning - Adaptive Control Architecture for Morphing

This paper develops a control methodology for morphing, which combines Machine Learning and Adaptive Dynamic Inversion Control. The morphing control function, which uses Reinforcement Learning, is integrated with the trajectory tracking function, which uses Structured Adaptive Model Inversion Control. Optimality is addressed by cost functions representing optimal shapes corresponding to specified operating conditions, and an episodic Reinforcement Learning simulation is developed to learn the optimal shape change policy. The methodology is demonstrated by a numerical example of a 3-D morphing air vehicle, which simultaneously tracks a specified trajectory and autonomously morphs over a set of shapes corresponding to flight conditions along the trajectory. Results presented in the paper show that this methodology is capable of learning the required shape and morphing into it, and accurately tracking the reference trajectory in the presence of parametric uncertainties and initial error conditions.
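
As a rough illustration of what an episodic learning loop for a shape change policy might look like, the following is a minimal sketch only. The abstract does not specify the learning algorithm, state space, actions, or cost functions, so everything here is an assumption made for illustration: tabular Q-learning is swapped in as the learner, the shape is reduced to a single discrete index, the flight condition is a discrete label held fixed within an episode, and the quadratic `cost` and the `OPTIMAL_SHAPE` table are hypothetical.

```python
import random

# Hypothetical problem size: 3 flight conditions, 5 discrete shapes.
N_CONDITIONS = 3
N_SHAPES = 5
ACTIONS = (-1, 0, 1)  # decrease shape index, hold, increase shape index

# Assumed optimal shape for each flight condition (illustrative values only).
OPTIMAL_SHAPE = {0: 0, 1: 2, 2: 4}


def cost(condition, shape):
    """Assumed quadratic cost: zero at the optimal shape for the condition."""
    return (shape - OPTIMAL_SHAPE[condition]) ** 2


def step(condition, shape, action):
    """Apply a shape change clipped to the valid range; reward is negative cost."""
    next_shape = min(max(shape + action, 0), N_SHAPES - 1)
    return next_shape, -cost(condition, next_shape)


def train(episodes=2000, steps=10, alpha=0.2, gamma=0.9, epsilon=0.2, seed=0):
    """Episodic tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(c, s, a): 0.0
         for c in range(N_CONDITIONS)
         for s in range(N_SHAPES)
         for a in ACTIONS}
    for _ in range(episodes):
        # Each episode starts from a random flight condition and shape.
        condition = rng.randrange(N_CONDITIONS)
        shape = rng.randrange(N_SHAPES)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(condition, shape, a)])
            next_shape, reward = step(condition, shape, action)
            best_next = max(q[(condition, next_shape, a)] for a in ACTIONS)
            key = (condition, shape, action)
            # Standard Q-learning temporal-difference update.
            q[key] += alpha * (reward + gamma * best_next - q[key])
            shape = next_shape
    return q


def greedy_rollout(q, condition, shape, steps=10):
    """Follow the learned greedy policy and return the final shape index."""
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: q[(condition, shape, a)])
        shape, _ = step(condition, shape, action)
    return shape


if __name__ == "__main__":
    q_table = train()
    for c in range(N_CONDITIONS):
        print("condition", c, "final shape", greedy_rollout(q_table, c, shape=2))
```

In this toy setting the learned policy only selects the shape command. In the architecture described above, a separate adaptive controller would handle trajectory tracking while the vehicle morphs; that part is not modeled in the sketch.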
