Paper Information - UAV Maneuvering Target Tracking in Uncertain Environments Based on Deep Reinforcement Learning and Meta-Learning

This paper combines deep reinforcement learning (DRL) with meta-learning and proposes a novel approach, Meta Twin Delayed Deep Deterministic Policy Gradient (Meta-TD3), for controlling an unmanned aerial vehicle (UAV) so that it can quickly track a target whose motion is uncertain. The approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for multi-task learning in the DRL algorithm, and we incorporate meta-learning into a multi-task reinforcement learning update method to ensure that the learned policy generalizes. Compared with the state-of-the-art algorithms Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3), experimental results show that Meta-TD3 achieves a considerable improvement in both convergence value and convergence rate. In a UAV target-tracking problem, Meta-TD3 needs only a few training steps for the UAV to adapt quickly to a new target movement mode while maintaining better tracking performance.
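
The abstract does not give the layout of the multi-task experience replay buffer, so the following is only a minimal sketch of one plausible reading: one bounded FIFO buffer per task (for example, one per target movement mode), sampled in equal-sized batches so that every task contributes to each update. The class name `MultiTaskReplayBuffer`, its parameters, and the equal-per-task sampling rule are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import deque


class MultiTaskReplayBuffer:
    """Hypothetical sketch: one bounded FIFO buffer per task."""

    def __init__(self, task_ids, capacity_per_task=10000, seed=None):
        # Each task keeps its own buffer; the oldest transitions are
        # dropped automatically once capacity_per_task is reached.
        self._buffers = {t: deque(maxlen=capacity_per_task) for t in task_ids}
        self._rng = random.Random(seed)

    def add(self, task_id, state, action, reward, next_state, done):
        """Store one transition under the task that generated it."""
        self._buffers[task_id].append((state, action, reward, next_state, done))

    def sample(self, batch_per_task):
        """Return {task_id: [transitions]} with batch_per_task items per task."""
        batches = {}
        for task_id, buf in self._buffers.items():
            if len(buf) < batch_per_task:
                raise ValueError(f"task {task_id!r} has only {len(buf)} transitions")
            # Assumption: uniform sampling without replacement within a task.
            batches[task_id] = self._rng.sample(list(buf), batch_per_task)
        return batches

    def __len__(self):
        # Total number of stored transitions across all tasks.
        return sum(len(buf) for buf in self._buffers.values())
```

A multi-task update could then iterate over the returned dictionary and compute one loss per task before combining them; how Meta-TD3 actually combines the per-task updates is not specified in the abstract and is left out of this sketch.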
