Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp

Blimps are a promising platform for aerial robotics and have been studied extensively for this purpose. Unlike other aerial vehicles, blimps are relatively safe and also possess the ability to loiter for long periods. These advantages, however, have been difficult to exploit because blimp dynamics are complex and inherently non-linear. The classical approach to system modeling represents the system as an ordinary differential equation (ODE) based on Newtonian principles. A more recent modeling approach is based on representing state transitions as a Gaussian process (GP). In this paper, we present a general technique for system identification that combines these two modeling approaches into a single formulation. This is done by training a Gaussian process on the residual between the non-linear model and ground truth training data. The result is a GP-enhanced model that provides an estimate of uncertainty in addition to giving better state predictions than either ODE or GP alone. We show how the GP-enhanced model can be used in conjunction with reinforcement learning to generate a blimp controller that is superior to those learned with ODE or GP models alone.
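The residual-learning idea in the abstract can be sketched in a few lines: a parametric one-step model plays the role of the ODE, and a Gaussian process is fit to the gap between that model's predictions and observed data, so the combined predictor returns both a corrected mean and a variance. The snippet below is a minimal illustration under assumed toy dynamics, not the paper's blimp model; `ode_predict` and its drag term are hypothetical stand-ins, and scikit-learn's `GaussianProcessRegressor` substitutes for whatever GP implementation the authors used.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def ode_predict(state, control, dt=0.1):
    """Hypothetical parametric (ODE-style) one-step model: linear drag."""
    return state + dt * (control - 0.5 * state)

rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=100)
controls = rng.uniform(-1, 1, size=100)

# Simulated "ground truth" next states: the ODE dynamics plus an
# unmodeled nonlinearity (the sin term) and observation noise.
true_next = (states
             + 0.1 * (controls - 0.5 * states + 0.3 * np.sin(states))
             + 0.01 * rng.standard_normal(100))

# Train the GP on the residual between ground truth and the ODE model.
residual = true_next - ode_predict(states, controls)
X = np.column_stack([states, controls])
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, residual)

# GP-enhanced prediction: ODE output plus GP residual mean, with an
# uncertainty estimate from the GP's predictive standard deviation.
x_query = np.array([[0.2, -0.4]])
mean_res, std_res = gp.predict(x_query, return_std=True)
enhanced = ode_predict(0.2, -0.4) + mean_res[0]
```

On the training set, the GP-corrected predictor should recover most of the structured error the ODE model misses, while `std_res` gives the uncertainty that a plain ODE model cannot provide, which is what makes the combined model usable inside a reinforcement-learning loop.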
