UAV Maneuvering Target Tracking in Uncertain Environments Based on Deep Reinforcement Learning and Meta-Learning

This paper combines Deep Reinforcement Learning (DRL) with Meta-learning and proposes a novel approach, named Meta Twin Delayed Deep Deterministic policy gradient (Meta-TD3), for controlling an Unmanned Aerial Vehicle (UAV) so that it can quickly track a target whose motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for multi-task learning in the DRL algorithm, and we combine it with Meta-learning to develop a multi-task reinforcement learning update method that ensures the generalization capability of reinforcement learning. Experimental results show that, compared with the state-of-the-art algorithms Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic policy gradient (TD3), Meta-TD3 achieves a marked improvement in both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps to enable a UAV to adapt quickly to a new target movement mode and to maintain better tracking effectiveness.
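
As an illustration of the multi-task experience replay buffer mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes that each target movement mode is treated as a separate task and that each task gets its own fixed-capacity buffer, from which one mini-batch per task is drawn. The class name `MultiTaskReplayBuffer`, its methods, and the task labels are all hypothetical.

```python
import random
from collections import deque


class MultiTaskReplayBuffer:
    """Hypothetical sketch: one FIFO buffer of transitions per task."""

    def __init__(self, capacity_per_task):
        self.capacity_per_task = capacity_per_task
        self.buffers = {}  # task_id -> deque of transitions

    def add(self, task_id, state, action, reward, next_state, done):
        # Create the per-task buffer on first use; the oldest transitions
        # are dropped automatically once the capacity is reached.
        buf = self.buffers.setdefault(
            task_id, deque(maxlen=self.capacity_per_task)
        )
        buf.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        """Return {task_id: list of transitions}, one mini-batch per task."""
        batches = {}
        for task_id, buf in self.buffers.items():
            k = min(batch_size, len(buf))  # never ask for more than stored
            batches[task_id] = rng.sample(list(buf), k)
        return batches

    def __len__(self):
        return sum(len(buf) for buf in self.buffers.values())


# Usage: two assumed target movement modes, each treated as its own task.
buffer = MultiTaskReplayBuffer(capacity_per_task=1000)
for task_id in ("straight_line", "circle"):
    for t in range(50):
        buffer.add(task_id, [float(t)], [0.0], -1.0, [float(t + 1)], False)

batches = buffer.sample(batch_size=8)  # one mini-batch of 8 per task
```

Keeping the tasks in separate buffers means a multi-task update can draw the same number of transitions from every movement mode, so no single mode dominates a training step. How Meta-TD3 actually combines the per-task batches in its update is not specified in this abstract.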
