Paper Information - An Application of Reinforcement Learning to Aerobatic Helicopter Flight

Autonomous helicopter flight is widely regarded as a highly challenging control problem. This paper presents the first successful autonomous completion, on a real RC helicopter, of the following four aerobatic maneuvers: forward flip and sideways roll at low speed, tail-in funnel, and nose-in funnel. Our experimental results significantly extend the state of the art in autonomous helicopter flight. We used the following approach: first, we had a pilot fly the helicopter to help us find a helicopter dynamics model and a reward (cost) function. Then we used a reinforcement learning (optimal control) algorithm to find a controller that is optimized for the resulting model and reward function. More specifically, we used differential dynamic programming (DDP), an extension of the linear quadratic regulator (LQR).
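
The abstract names LQR as the method that DDP extends. As a rough illustration of that building block only, the sketch below runs the standard finite-horizon backward Riccati recursion for a scalar linear system x[t+1] = a*x[t] + b*u[t] with cost q*x^2 + r*u^2 per step. This is not the paper's helicopter controller: the system, the function names `lqr_gains` and `simulate`, and all numeric values are hypothetical and chosen only for the example. DDP would, roughly, apply this kind of solve repeatedly to local approximations of a nonlinear model around the current trajectory.

```python
def lqr_gains(a, b, q, r, horizon):
    """Finite-horizon LQR gains for the scalar system x[t+1] = a*x[t] + b*u[t].

    Minimizes the sum of q*x^2 + r*u^2 via the backward Riccati recursion.
    Returns gains so that the control at step t is u[t] = -gains[t] * x[t].
    """
    p = q  # cost-to-go coefficient at the final step
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)   # feedback gain for this step
        p = q + a * p * a - a * p * b * k   # Riccati update of the cost-to-go
        gains.append(k)
    gains.reverse()  # recursion ran backward in time; put gains in forward order
    return gains


def simulate(a, b, gains, x0):
    """Roll the closed-loop system forward from x0 and return the state history."""
    xs = [x0]
    for k in gains:
        u = -k * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


# Hypothetical, open-loop unstable system (a > 1), values are illustrative only.
gains = lqr_gains(a=1.1, b=0.5, q=1.0, r=0.1, horizon=50)
trajectory = simulate(1.1, 0.5, gains, x0=1.0)
```

With these illustrative values the open-loop system would diverge, while the state under the computed feedback gains decays toward zero.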
