A Reinforcement Learning Approach for Control of a Nature-Inspired Aerial Vehicle

In this work, reinforcement learning is used to develop a position controller for an underactuated, nature-inspired Unmanned Aerial Vehicle (UAV). Unlike standard multi-rotors or fixed-wing aircraft, this UAV configuration generates lift by spinning its entire body. Deep Deterministic Policy Gradients (DDPG) with Ape-X Distributed Prioritized Experience Replay was used to train neural network function approximators, which were then deployed as the final control policy. The reinforcement learning agent was trained in simulation and transferred directly to the physical hardware. Position control tests were performed with the learned control policy and compared against a baseline PID controller. The learned controller was found to exhibit better control over the inherent oscillations that arise from the non-linear dynamics of the platform.
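As a rough illustration of the two ingredients named above, the following minimal sketch computes a one-step DDPG critic target and the proportional sampling probabilities and importance weights used in prioritized experience replay. It is not the authors' implementation: the function names, the toy batch, and the hyperparameters (`gamma`, `alpha`, `beta`, `eps`) are assumptions chosen for illustration, and the values that would come from the target networks are replaced by a fixed array.

```python
import numpy as np


def ddpg_td_target(reward, next_q, done, gamma=0.99):
    """One-step critic target: y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    `next_q` stands in for the target critic evaluated at the target
    actor's action; here it is just an array (illustrative assumption).
    """
    return reward + gamma * (1.0 - done) * next_q


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with priority p_i = |td_error_i| + eps."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled so max is 1."""
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()


# Toy batch of three transitions (hypothetical numbers).
rewards = np.array([1.0, 0.0, -1.0])
next_q = np.array([2.0, 2.0, 2.0])
done = np.array([0.0, 0.0, 1.0])
current_q = np.array([2.5, 2.5, -1.0])

targets = ddpg_td_target(rewards, next_q, done)
td_errors = targets - current_q
probs = per_probabilities(td_errors)
weights = importance_weights(probs)
```

Transitions with a larger absolute TD error are sampled more often, and the importance weights scale down their contribution to the critic loss to compensate for that sampling bias.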