Transmit Power Pool Design for Grant-Free NOMA-IoT Networks via Deep Reinforcement Learning

Grant-free non-orthogonal multiple access (GF-NOMA) is a candidate multiple access framework for enhancing connectivity in short-packet Internet of Things (IoT) networks. However, resource allocation in GF-NOMA is challenging because closed-loop power control is absent. We design a prototype transmit power pool (PP) to provide open-loop power control. IoT users acquire their transmit power in advance from this prototype PP solely according to their communication distances. First, a multi-agent deep Q-network (DQN) aided GF-NOMA algorithm is proposed to determine the optimal transmit power levels for the prototype PP. More specifically, each IoT user acts as an agent and, by interacting with the wireless environment, learns a policy that guides it to select optimal actions. Second, to prevent the overestimation problem of the Q-learning model, a double DQN (DDQN) based GF-NOMA algorithm is proposed. Numerical results confirm that the DDQN-based algorithm finds the optimal transmit power levels that form the PP. Compared with the conventional online learning approach, the proposed algorithm with the prototype PP converges faster under changing environments because the action space is limited based on previous learning. In terms of throughput, the considered GF-NOMA system outperforms both networks with fixed transmission power, in which all users have the same transmit power, and traditional GF with orthogonal multiple access techniques.
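
The difference between the DQN and DDQN learning targets mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the reward, the discount factor, and the Q-values are all hypothetical, and the three actions merely stand in for candidate transmit power levels. In standard DQN the target network both selects and evaluates the next action, which tends to overestimate; in double DQN the online network selects the action and the target network evaluates it.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float]) -> float:
    """Standard DQN target: the target network selects and evaluates."""
    return reward + gamma * max(q_target_next)


def ddqn_target(reward: float, gamma: float,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float]) -> float:
    """Double DQN target: online network selects, target network evaluates."""
    # Index of the action the online network considers best in the next state.
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values for three candidate power levels.
q_online_next = [1.0, 2.0, 1.5]
q_target_next = [1.25, 1.0, 3.0]

y_dqn = dqn_target(reward=1.0, gamma=0.5, q_target_next=q_target_next)
y_ddqn = ddqn_target(reward=1.0, gamma=0.5,
                     q_online_next=q_online_next,
                     q_target_next=q_target_next)

print(y_dqn, y_ddqn)
```

With these illustrative numbers the DQN target uses the largest target-network value, while the DDQN target uses the target-network value of the action preferred by the online network, so it is no larger than the DQN target.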
