Hierarchical Deep Reinforcement Learning for Backscattering Data Collection With Multiple UAVs

Backscatter communication is an emerging technology recognized as a promising solution to the battery problem of Internet of Things (IoT) devices. For example, a wireless sensor network that uses backscatter communication can monitor the environment in remote areas without battery maintenance or replacement. However, the transmission range of backscatter communication is limited. To tackle this challenge, we propose a multi-UAV-aided data collection scenario in which an unmanned aerial vehicle (UAV) flies close to a backscatter sensor node (BSN) to activate it and then collect its data. We aim to minimize the total flight time that the rechargeable UAVs need to complete the collection mission. During data collection, a UAV can return to the charging station to recharge when its remaining energy is insufficient to complete the mission. To reduce the complexity of the task, we first use Gaussian mixture model clustering to divide the BSNs into multiple clusters. We then consider two kinds of boundaries for the UAV flying regions: deterministic and ambiguous. For the deterministic boundary scenario, we propose a single-agent deep option learning (SADOL) algorithm, in which each UAV cannot fly beyond its deterministic boundary. For the ambiguous boundary scenario, we propose a multiagent deep option learning (MADOL) algorithm that enables the UAVs to cooperatively learn the assignment of the ambiguous BSNs. In simulations, we compare the proposed algorithms with the multiagent deep deterministic policy gradient (MADDPG), deep deterministic policy gradient (DDPG), and deep Q-network (DQN) algorithms; the results show that the proposed algorithms achieve better performance.
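The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the covariance structure, the initialization, or the number of clusters, so this sketch assumes 2-D BSN coordinates, spherical covariances, and a deterministic farthest-point initialization. All function names (`farthest_point_init`, `e_step`, `gmm_cluster`) are hypothetical.

```python
import math

def farthest_point_init(points, k):
    """Pick k spread-out starting means (an assumed, deterministic initializer)."""
    means = [points[0]]
    while len(means) < k:
        means.append(max(points, key=lambda p: min(math.dist(p, m) for m in means)))
    return means

def e_step(points, means, variances, weights):
    """Return, for each point, the posterior probability of every component."""
    resp = []
    for p in points:
        # Log density of a 2-D spherical Gaussian plus the log mixing weight.
        log_p = [
            math.log(w) - math.log(2 * math.pi * v) - math.dist(p, m) ** 2 / (2 * v)
            for m, v, w in zip(means, variances, weights)
        ]
        top = max(log_p)  # subtract the max before exponentiating for stability
        probs = [math.exp(lp - top) for lp in log_p]
        total = sum(probs)
        resp.append([q / total for q in probs])
    return resp

def gmm_cluster(points, k, iters=50):
    """Fit a spherical Gaussian mixture with EM and return one hard label per point."""
    n = len(points)
    means = farthest_point_init(points, k)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Start every component with the overall per-axis variance of the data.
    spread = sum(math.dist(p, (cx, cy)) ** 2 for p in points) / (2 * n)
    variances = [max(spread, 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        resp = e_step(points, means, variances, weights)
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            means[j] = (mx, my)
            var = sum(r[j] * math.dist(p, means[j]) ** 2
                      for r, p in zip(resp, points)) / (2 * nj)
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed component
            weights[j] = max(nj / n, 1e-12)
    resp = e_step(points, means, variances, weights)
    return [max(range(k), key=lambda j: r[j]) for r in resp]
```

Calling `gmm_cluster(bsn_positions, k)` with a list of `(x, y)` tuples returns a cluster index for every BSN. The per-point responsibilities computed in `e_step` are soft, and the final `argmax` is only one way to turn them into a hard assignment.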
