Reinforcement Learning-Based Trajectory Design for the Drone Base Stations

In this paper, the trajectory optimization problem for a multi-aerial base station (ABS) communication network is investigated. The objective is to find the trajectories of the ABSs that maximize the sum-rate of the users served by each ABS. Beyond trajectory design, optimal power and sub-channel allocation is also of great importance for supporting the users with the highest possible data rates. To solve this complicated problem, we divide it into two sub-problems: an ABS trajectory optimization sub-problem, and a joint power and sub-channel assignment sub-problem. Then, based on the Q-learning method, we develop a distributed algorithm that solves these sub-problems efficiently and does not need a significant amount of information exchange between the ABSs and the core network. Simulation results show that although Q-learning is a model-free reinforcement learning technique, it is able to train the ABSs to optimize their trajectories based on the received reward signals, which carry useful information about the topology of the network.
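To make the Q-learning idea concrete, the following is a minimal sketch of tabular Q-learning for a single ABS moving on a small grid. It is not the distributed algorithm developed in the paper: the grid size, user locations, fixed altitude, toy rate model, and all function names (`sum_rate`, `step`, `train`, `greedy_trajectory`) are assumptions chosen for illustration, and power and sub-channel allocation are left out entirely.

```python
import math
import random

# All constants below are hypothetical values for illustration only.
GRID = 5                                   # 5x5 grid of candidate ABS positions
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus hover
USERS = [(4, 4), (3, 4), (4, 3)]           # fixed ground-user locations
ALTITUDE = 1.0                             # fixed ABS altitude, in grid units


def sum_rate(pos):
    """Toy reward: sum of log2(1 + SNR), with SNR proportional to 1/distance^2."""
    total = 0.0
    for ux, uy in USERS:
        d2 = (pos[0] - ux) ** 2 + (pos[1] - uy) ** 2 + ALTITUDE ** 2
        total += math.log2(1.0 + 10.0 / d2)
    return total


def step(pos, action):
    """Apply a move; moves that would leave the grid are clipped to the border."""
    x = min(max(pos[0] + action[0], 0), GRID - 1)
    y = min(max(pos[1] + action[1], 0), GRID - 1)
    return (x, y)


def train(episodes=500, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(GRID) for y in range(GRID) for a in range(n_actions)}
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)                          # explore
            else:
                a = max(range(n_actions), key=lambda i: q[(pos, i)])  # exploit
            nxt = step(pos, ACTIONS[a])
            reward = sum_rate(nxt)
            best_next = max(q[(nxt, i)] for i in range(n_actions))
            # Standard Q-learning update toward reward + discounted best next value.
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q


def greedy_trajectory(q, start=(0, 0), horizon=20):
    """Follow the learned greedy policy from a start position."""
    pos, path = start, [start]
    for _ in range(horizon):
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos = step(pos, ACTIONS[a])
        path.append(pos)
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_trajectory(q_table))
```

The agent never uses the user locations directly; it only observes the scalar reward after each move, which is the sense in which the method is model-free. In a multi-ABS setting, each ABS would presumably keep its own table and learn from its own reward signal, but that extension is not shown here.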
