Autonomous Tracking Using a Swarm of UAVs: A Constrained Multi-Agent Reinforcement Learning Approach

In this paper, we design an autonomous tracking system in which a swarm of unmanned aerial vehicles (UAVs) localizes a mobile radio frequency (RF) target. The UAVs are equipped with omnidirectional received signal strength (RSS) sensors and cooperatively search for the target with a specified tracking accuracy. To achieve fast localization and tracking in a highly dynamic channel environment (e.g., time-varying transmit power and intermittent signals), we formulate the flight decision problem as a constrained Markov decision process (CMDP) whose main objective is to avoid redundant UAV flight paths. We then propose an enhanced multi-agent reinforcement learning algorithm that coordinates multiple UAVs performing real-time target tracking. The core of the proposed scheme is a feedback control system that takes into account the uncertainty of the channel estimate. We prove that the proposed algorithm converges to the optimal decision. Simulation results show that the proposed scheme outperforms standard Q-learning and multi-agent Q-learning algorithms in terms of searching time and successful localization probability.
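The abstract does not spell out the learning rule, so the following is only a minimal sketch of one common way to apply Q-learning to a CMDP: a Lagrangian relaxation, in which the constraint cost is folded into the reward through a multiplier that is adjusted by dual ascent. It is not the paper's feedback control scheme. The function names, the reward and cost signals, the cost budget, and all step sizes are assumptions chosen for illustration.

```python
# Generic Lagrangian-relaxed tabular Q-learning for a CMDP (illustrative only).
# All names and parameter values here are hypothetical, not taken from the paper.

def lagrangian_q_update(q, lam, s, a, reward, cost, s_next,
                        alpha=0.1, gamma=0.9):
    """Apply one Q-learning step using the penalized reward r - lam * c.

    q is a list of per-state lists of action values and is updated in place.
    """
    penalized = reward - lam * cost
    td_target = penalized + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q


def dual_update(lam, avg_cost, budget, eta=0.05):
    """Projected dual ascent on the multiplier.

    The multiplier grows when the observed average cost exceeds the budget
    and shrinks otherwise, but it is never allowed to go below zero.
    """
    return max(0.0, lam + eta * (avg_cost - budget))


if __name__ == "__main__":
    # Two states, two actions, all action values initialized to zero.
    q = [[0.0, 0.0], [0.0, 0.0]]
    lam = 0.5
    lagrangian_q_update(q, lam, s=0, a=1, reward=1.0, cost=1.0, s_next=1)
    lam = dual_update(lam, avg_cost=2.0, budget=1.0)
    print(q, lam)
```

In a multi-agent setting, each UAV could keep its own table and share the multiplier, but how the agents coordinate in the proposed scheme is not stated in the abstract.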
