Optimization for Master-UAV-Powered Auxiliary-Aerial-IRS-Assisted IoT Networks: An Option-Based Multi-Agent Hierarchical Deep Reinforcement Learning Approach

This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network, in which we propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV, with the MUAV also serving as the AUAV's recharging power source. Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network. Two optimization problems are formulated, depending on whether charging takes place between the two UAVs. To solve them, we propose two multi-agent deep reinforcement learning (DRL) approaches: centralized training multi-agent deep deterministic policy gradient (CT-MADDPG) and multi-agent deep deterministic policy option critic (MADDPOC). It is shown that CT-MADDPG greatly reduces the computing capability required of the UAV hardware, and that the proposed MADDPOC supports low-level multi-agent cooperative learning in continuous action domains, a clear advantage over existing option-based hierarchical DRL methods, which support only single-agent learning and discrete actions.
