Online Adaptive Critic Flight Control using Approximated Plant Dynamics

A relatively new approach to adaptive flight control is the use of reinforcement learning methods such as adaptive critic designs. Controllers that apply reinforcement learning learn by interacting with the environment, and their ability to adapt online makes them especially useful in adaptive and reconfigurable flight control systems. This paper focuses on two types of adaptive critic design: one is action dependent, and the other uses an approximation of the plant dynamics. The goal of this paper is to gain insight into the theoretical and practical differences between these two controllers when they are applied in an online environment with changing plant dynamics. To investigate the practical differences, the controllers are implemented for a model of the General Dynamics F-16, and their characteristics are investigated and compared by conducting several experiments in two phases. First, the controllers are trained offline to control the baseline F-16 model; next, the dynamics of the F-16 model are changed online and the controllers have to adapt to the new plant dynamics. The results from the offline experiments show that the controller with the approximated plant dynamics has a higher success ratio in learning to control the baseline F-16 model. The online experiments further show that this controller outperforms the action-dependent controller in adapting to changed plant dynamics.
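To make the action-dependent idea concrete, the sketch below shows a minimal ADHDP-style (action-dependent heuristic dynamic programming) update loop on a scalar linear plant. A linear critic estimates the cost-to-go from state and action, and the actor is improved by descending the critic's gradient with respect to the action, so no plant model is required. The plant parameters, feature choices, and learning rates here are illustrative assumptions, not the paper's controllers or the F-16 model.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 0.9, 0.5                 # assumed scalar plant: x' = a*x + b*u
gamma = 0.95                    # discount factor
alpha_c, alpha_a = 0.05, 0.01   # critic / actor learning rates

wc = np.zeros(3)                # critic weights over features [x^2, x*u, u^2]
wa = 0.0                        # linear actor: u = wa * x

def features(x, u):
    return np.array([x * x, x * u, u * u])

def critic(x, u):
    return wc @ features(x, u)

x = 1.0
for step in range(2000):
    u = wa * x + 0.01 * rng.standard_normal()   # small exploration noise
    cost = x * x + u * u                        # quadratic stage cost
    x_next = a * x + b * u
    u_next = wa * x_next
    # Temporal-difference error of the action-dependent value estimate
    td = cost + gamma * critic(x_next, u_next) - critic(x, u)
    wc += alpha_c * td * features(x, u)
    # Actor update through the critic: dQ/du, no plant model needed
    dq_du = wc[1] * x + 2.0 * wc[2] * u
    wa -= alpha_a * dq_du * x
    x = x_next
    if abs(x) > 50.0:                           # reset if early learning diverges
        x = 1.0

print(abs(x) < 1.0)
```

The model-based variant discussed in the paper would instead back-propagate the actor update through an approximation of the plant dynamics rather than through the critic's action input; the trade-off between those two gradient paths is exactly what the offline and online experiments compare.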
