Reinforcement Learning in Multiple-UAV Networks: Deployment and Movement Design

A novel framework is proposed for quality-of-experience (QoE) driven deployment and dynamic movement of multiple unmanned aerial vehicles (UAVs). The joint non-convex three-dimensional (3-D) deployment and dynamic movement of the UAVs is formulated as a problem of maximizing the sum mean opinion score (MOS) of ground users, and this problem is proved to be NP-hard. To solve it, a three-step approach is proposed for attaining the 3-D deployment and dynamic movement of multiple UAVs. First, a genetic algorithm based K-means (GAK-means) algorithm is used to obtain the cell partition of the users. Second, a Q-learning based deployment algorithm is proposed, in which each UAV acts as an agent that makes its own decisions to attain its 3-D position by learning from trial and error. In contrast to conventional genetic algorithm based learning algorithms, the proposed algorithm is capable of training the direction selection strategy offline. Third, a Q-learning based movement algorithm is proposed for the scenario in which the users are roaming. The proposed algorithm is capable of converging to an optimal state. Numerical results reveal that the proposed algorithms converge quickly, after a small number of iterations. Additionally, the proposed Q-learning based deployment algorithm outperforms the K-means and Iterative-GAKmean algorithms at low complexity.
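The Q-learning deployment step can be illustrated with a minimal sketch in which a single UAV agent learns a direction selection strategy on a small 3-D grid. Everything specific here is an assumption made for illustration and is not taken from the paper: the grid size `GRID`, the seven-action set `ACTIONS`, the hyperparameters, and especially `toy_reward`, which is a simple distance-based stand-in for the sum MOS objective rather than the paper's QoE model.

```python
import random

# Hypothetical action set: one grid step along each axis, or hover in place.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0),
           (0, 0, 1), (0, 0, -1), (0, 0, 0)]
GRID = 5  # assumed number of grid points per axis


def step(state, action):
    """Apply a move and clamp the UAV position to the grid."""
    return tuple(min(GRID - 1, max(0, s + a)) for s, a in zip(state, action))


def toy_reward(state, users):
    """Stand-in for the sum MOS: negative total squared distance from the
    UAV at (x, y, z) to ground users given as (ux, uy) pairs at height 0."""
    x, y, z = state
    return -sum((x - ux) ** 2 + (y - uy) ** 2 + z ** 2 for ux, uy in users)


def train(users, episodes=500, steps=30, alpha=0.5, gamma=0.9,
          epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.
    Returns a dict mapping (state, action_index) to a Q-value."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = tuple(rng.randrange(GRID) for _ in range(3))
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)),
                        key=lambda i: q.get((state, i), 0.0))  # exploit
            nxt = step(state, ACTIONS[a])
            r = toy_reward(nxt, users)
            best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Standard Q-learning update rule.
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
    return q


def greedy_rollout(q, start, steps=30):
    """Follow the learned greedy policy from a start position."""
    state = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state = step(state, ACTIONS[a])
    return state


if __name__ == "__main__":
    users = [(1, 1), (2, 3), (3, 2)]  # made-up user positions in one cell
    q_table = train(users)
    print(greedy_rollout(q_table, start=(4, 4, 4)))
```

Because the table is learned before the rollout, the sketch mirrors the idea of training the direction selection strategy offline and then applying it. In a multi-UAV setting, one such agent would run per cell of the user partition, which this sketch does not model.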
