Autonomous UAV Navigation Using Reinforcement Learning

Unmanned aerial vehicles (UAVs) are commonly deployed on missions in unknown environments, where an exact mathematical model of the environment may not be available. This paper provides a framework that uses reinforcement learning to enable a UAV to navigate such environments successfully. We conducted both simulations and a real-world implementation to show how a UAV can learn to navigate through an unknown environment. Technical aspects of applying the reinforcement learning algorithm to a UAV system, as well as UAV flight control, are also addressed. This work enables continuing research on UAVs with learning capabilities in more critical applications, such as wildfire monitoring or search-and-rescue missions.
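The abstract does not give the learning algorithm itself; the core idea of navigating an unknown environment by trial and error can be sketched with standard tabular Q-learning on a discretized grid world. The grid size, goal location, reward scheme, and hyperparameters below are illustrative assumptions, not values taken from the paper:

```python
import random

# Minimal tabular Q-learning sketch for grid-based navigation.
# All constants here are illustrative assumptions.
GRID = 5                                       # 5x5 discretized environment
GOAL = (4, 4)                                  # target cell
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # E, W, S, N moves
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1              # step size, discount, exploration

# Q-table: one value per (state, action) pair, initialized to zero.
Q = {((r, c), a): 0.0
     for r in range(GRID) for c in range(GRID)
     for a in range(len(ACTIONS))}

def step(state, a):
    """Apply action a; clip at grid edges; reward goal, penalize each move."""
    dr, dc = ACTIONS[a]
    nxt = (min(max(state[0] + dr, 0), GRID - 1),
           min(max(state[1] + dc, 0), GRID - 1))
    reward = 100.0 if nxt == GOAL else -1.0
    return nxt, reward

def greedy(state):
    return max(range(len(ACTIONS)), key=lambda a: Q[(state, a)])

random.seed(0)
for episode in range(500):
    s = (0, 0)
    while s != GOAL:
        # Epsilon-greedy action selection.
        a = random.randrange(len(ACTIONS)) if random.random() < EPS else greedy(s)
        s2, r = step(s, a)
        # Standard Q-learning update (Watkins & Dayan, 1992).
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)]
                              for b in range(len(ACTIONS))) - Q[(s, a)])
        s = s2

# After training, following the greedy policy should reach the goal.
s, path = (0, 0), [(0, 0)]
while s != GOAL and len(path) < 50:
    s, _ = step(s, greedy(s))
    path.append(s)
```

In a real UAV setting, the state would come from onboard localization rather than a known grid, which is precisely why a model-free method like Q-learning is attractive when no exact mathematical model of the environment is available.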
