3D UAV Trajectory and Data Collection Optimisation Via Deep Reinforcement Learning

Unmanned aerial vehicles (UAVs) are increasingly deployed to enhance network performance and coverage in wireless communication. However, owing to their limited on-board power and flight time, obtaining an optimal resource allocation scheme for the UAV-assisted Internet of Things (IoT) is challenging. In this paper, we design a new UAV-assisted IoT system that relies on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices. A deep reinforcement learning-based technique is then conceived for finding the optimal trajectory and throughput in a specific coverage area. After training, the UAV is able to autonomously collect all the data from the user nodes with a significant total sum-rate improvement, while minimising the associated resources used. Numerical results are provided to highlight how our technique strikes a balance between the throughput attained, the trajectory flown, and the time spent. More explicitly, we characterise the attainable performance in terms of the UAV trajectory, the expected reward and the total sum-rate.

Keywords: UAV-assisted wireless network, trajectory, data collection, deep reinforcement learning.
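As a rough illustration of such a training loop (not the authors' implementation), the sketch below trains a small Q-network to steer a UAV over a discretised 3D grid, rewarding the data drained from a few IoT nodes and penalising each time step. The grid size, node positions, distance-based rate model, reward weights and network architecture are all illustrative assumptions rather than the system model of the paper.

```python
import numpy as np
import tensorflow as tf

GRID, ALT, ACTIONS = 10, 5, 7        # 10x10 ground cells, 5 altitude levels, 7 moves
NODES = np.array([[2., 3., 0.], [7., 8., 0.], [5., 1., 0.]])   # hypothetical IoT device positions
MOVES = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0],
                  [0, 0, 1], [0, 0, -1], [0, 0, 0]], dtype=float)

class UAVDataEnv:
    """Toy environment: reward = data drained this step minus a time/energy cost."""
    def reset(self):
        self.pos = np.array([0., 0., 2.])                 # start at a corner, mid altitude
        self.data = np.full(len(NODES), 5.0)              # remaining data units per node
        return self._state()

    def _state(self):
        return np.concatenate([self.pos / [GRID, GRID, ALT], self.data / 5.0])

    def step(self, a):
        self.pos = np.clip(self.pos + MOVES[a], 0, [GRID - 1, GRID - 1, ALT - 1])
        dist = np.linalg.norm(NODES - self.pos, axis=1) + 1.0
        drained = np.minimum(1.0 / dist, self.data)       # crude distance-based rate
        self.data -= drained
        reward = drained.sum() - 0.05                     # throughput minus per-step penalty
        done = bool(self.data.sum() < 1e-3)               # all data collected
        return self._state(), reward, done

def build_qnet(state_dim):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(ACTIONS)])                  # one Q-value per move

env = UAVDataEnv()
qnet = build_qnet(3 + len(NODES))
qnet.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
gamma, eps = 0.95, 0.2                                    # discount factor, exploration rate

for episode in range(50):
    s, done, total = env.reset(), False, 0.0
    for _ in range(200):                                  # cap the episode length
        q = qnet.predict(s[None], verbose=0)[0]
        a = np.random.randint(ACTIONS) if np.random.rand() < eps else int(q.argmax())
        s_next, r, done = env.step(a)
        target = q.copy()
        q_next = qnet.predict(s_next[None], verbose=0)[0]
        target[a] = r + (0.0 if done else gamma * q_next.max())
        qnet.fit(s[None], target[None], verbose=0)        # one gradient step on the TD target
        s, total = s_next, total + r
        if done:
            break
    print(f"episode {episode}: return {total:.2f}")
```

The sketch only conveys the state-action-reward structure of the problem; a practical design would typically add a replay buffer and target network, or move to a continuous-action learner such as DDPG, to handle finer-grained 3D trajectories.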
