LSTM-Characterized Deep Reinforcement Learning for Continuous Flight Control and Resource Allocation in UAV-Assisted Sensor Network

Unmanned aerial vehicles (UAVs) can be employed to collect sensory data in remote wireless sensor networks (WSNs). Because the UAV maneuvers, scheduling one sensor device to transmit its data can overflow the data buffers of the unscheduled ground devices. Moreover, lossy airborne channels can cause packet reception errors at the scheduled sensor. This paper proposes a new deep-reinforcement-learning-based flight resource allocation framework (DeFRA) that minimizes the overall data packet loss over a continuous action space. Built on the Deep Deterministic Policy Gradient (DDPG), DeFRA optimally controls the instantaneous heading and speed of the UAV and selects the ground device for data collection. Furthermore, a state characterization layer based on long short-term memory (LSTM) is developed to predict network dynamics arising from time-varying airborne channels and energy arrivals at the ground devices. To validate the effectiveness of DeFRA, experimental data collected from a real-world UAV testbed and an energy-harvesting WSN are used to train the UAV's actions. Numerical results demonstrate that DeFRA converges quickly while reducing the packet loss by over 15% compared to existing deep reinforcement learning solutions.
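
The abstract describes a DDPG actor whose input passes through an LSTM state-characterization layer and whose continuous output jointly encodes the UAV's heading, its speed, and the choice of ground device. The following is a minimal sketch of such an actor network, not the authors' implementation: all dimensions, scaling constants, and the soft device-scoring scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LSTMCharacterizedActor(nn.Module):
    """DDPG-style actor with an LSTM state characterization layer (illustrative)."""

    def __init__(self, state_dim=8, num_devices=10, hidden_dim=64):
        super().__init__()
        # State characterization layer: summarizes a sliding window of past
        # observations (e.g., channel gains, buffer levels, energy arrivals).
        self.lstm = nn.LSTM(input_size=state_dim, hidden_size=hidden_dim,
                            batch_first=True)
        # Actor head: maps the LSTM summary to a deterministic continuous action.
        self.head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 2 + num_devices),  # heading, speed, device scores
        )

    def forward(self, state_history):
        # state_history: (batch, window_length, state_dim)
        _, (h_n, _) = self.lstm(state_history)
        out = self.head(h_n[-1])                            # last layer's hidden state
        heading = torch.pi * torch.tanh(out[:, 0:1])        # radians in [-pi, pi]
        speed = torch.sigmoid(out[:, 1:2])                  # normalized to [0, 1]
        device_scores = torch.softmax(out[:, 2:], dim=-1)   # soft scheduling weights
        return heading, speed, device_scores


# Usage: the scheduled ground device is taken as the argmax of the soft scores,
# which keeps the actor's output continuous as DDPG requires.
actor = LSTMCharacterizedActor()
history = torch.randn(1, 16, 8)  # dummy window of 16 past network observations
heading, speed, scores = actor(history)
scheduled_device = scores.argmax(dim=-1)
```

Relaxing the discrete device selection into a softmax over per-device scores is one common way to fit a scheduling decision into DDPG's continuous action space; the paper's exact action encoding may differ.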
