SREC: Proactive Self-Remedy of Energy-Constrained UAV-Based Networks via Deep Reinforcement Learning

Energy-aware control of multiple unmanned aerial vehicles (UAVs) is a major research interest in UAV-based networking, yet few existing works have examined how the network should react around the moment the UAV lineup changes. In this work, we study proactive self-remedy of energy-constrained UAV networks when one or more UAVs are running short of energy and about to quit for charging. We target an energy-aware optimal UAV control policy that proactively relocates the UAVs when any UAV is about to quit the network, rather than passively redeploying the remaining UAVs after the quit. Specifically, a deep reinforcement learning (DRL)-based self-remedy approach, named SREC-DRL, is proposed to maximize the accumulated user satisfaction scores over a period within which at least one UAV will quit the network. To handle the continuous state and action spaces of the problem, a state-of-the-art actor-critic DRL algorithm, deep deterministic policy gradient (DDPG), is applied for its superior convergence stability. Numerical results demonstrate that, compared with the passive-reaction method, the proposed SREC-DRL approach achieves a $12.12\%$ gain in accumulated user satisfaction score during the remedy period.
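Two ingredients of DDPG that the abstract alludes to (handling continuous actions and stabilizing convergence) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hyperparameters, the 2-D velocity action, and the weight vectors are all assumed for the example. It shows (i) Ornstein-Uhlenbeck exploration noise added to a deterministic continuous action, and (ii) the soft ("Polyak") target-network update that DDPG uses for convergence stability.

```python
import random

TAU = 0.01  # assumed soft-update rate (a typical DDPG choice)

def soft_update(target, source, tau=TAU):
    """Slowly track the learned weights: target <- (1 - tau) * target + tau * source."""
    return [(1 - tau) * t + tau * s for t, s in zip(target, source)]

class OUNoise:
    """Ornstein-Uhlenbeck process, the exploration noise commonly paired with DDPG."""
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, seed=0):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.state = [mu] * dim
        self.rng = random.Random(seed)

    def sample(self):
        # Mean-reverting step with Gaussian innovation.
        self.state = [
            x + self.theta * (self.mu - x) + self.sigma * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return self.state

# Example: perturb a deterministic 2-D velocity action for one UAV.
noise = OUNoise(dim=2)
action = [0.5, -0.3]  # hypothetical actor output mu(s)
noisy_action = [a + n for a, n in zip(action, noise.sample())]

# Example: one soft update of (toy) target-network weights.
target_w = soft_update([0.0, 0.0], [1.0, 2.0])  # moves 1% of the way toward source
```

Because the soft update mixes only a fraction `tau` of the learned weights into the target networks at each step, the bootstrapped targets in the critic loss drift slowly, which is the main source of DDPG's convergence stability over naive actor-critic updates.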
