A Dynamic Power Allocation Scheme in Power-Domain NOMA using Actor-Critic Reinforcement Learning

Non-orthogonal multiple access (NOMA) is one of the most promising technologies for next-generation cellular communication. However, finding an effective power allocation strategy remains an open problem in power-domain NOMA. In this paper, we propose a reinforcement learning (RL) method to solve the power allocation problem. Specifically, in power-domain NOMA, the base station (BS) transmits data to multiple users simultaneously under a sum-power constraint. Because the power that the BS allocates to each user determines the energy efficiency (EE) of the entire system, we propose an Actor-Critic RL framework that dynamically selects the power allocation coefficients. The Actor constructs a parameterized policy, the Critic evaluates it, and the Actor then adjusts the policy according to the Critic's feedback. Numerical results indicate that the proposed scheme effectively improves the EE of the entire system.
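
To make the Actor-Critic loop described above concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not specify the state, reward, or function approximators, so everything here is an assumption: a two-user downlink pair with fixed channel gains, a discrete set of candidate power allocation coefficients, a softmax Actor, and a single-state Critic that stores one scalar value. The names `energy_efficiency`, `softmax`, and `train`, and all numeric defaults, are hypothetical.

```python
import math
import random


def energy_efficiency(alpha, p_total=1.0, g_weak=0.2, g_strong=1.0,
                      noise=0.1, p_circuit=0.5):
    """EE of an assumed two-user downlink NOMA pair (sum rate / consumed power).

    alpha is the fraction of the sum power given to the weak user. The strong
    user is assumed to remove the weak user's signal by successive
    interference cancellation (SIC).
    """
    p_weak = alpha * p_total
    p_strong = (1.0 - alpha) * p_total
    # The weak user treats the strong user's signal as interference.
    rate_weak = math.log2(1.0 + p_weak * g_weak / (p_strong * g_weak + noise))
    # After SIC, the strong user sees only noise.
    rate_strong = math.log2(1.0 + p_strong * g_strong / noise)
    return (rate_weak + rate_strong) / (p_total + p_circuit)


def softmax(prefs):
    """Turn Actor preferences into action probabilities."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(actions, episodes=5000, actor_lr=0.05, critic_lr=0.1, seed=0):
    """One-step Actor-Critic over a discrete set of coefficients (assumed setup)."""
    rng = random.Random(seed)
    prefs = [0.0] * len(actions)  # Actor: parameterized softmax policy
    value = 0.0                   # Critic: value estimate of the single state
    for _ in range(episodes):
        probs = softmax(prefs)
        chosen = rng.choices(range(len(actions)), weights=probs)[0]
        reward = energy_efficiency(actions[chosen])
        # Critic feedback: how much better the reward was than expected.
        td_error = reward - value
        value += critic_lr * td_error
        # Actor update: policy-gradient step scaled by the Critic's feedback.
        for i in range(len(prefs)):
            grad = (1.0 if i == chosen else 0.0) - probs[i]
            prefs[i] += actor_lr * td_error * grad
    return softmax(prefs), value


if __name__ == "__main__":
    candidates = [0.55, 0.75, 0.95]  # assumed candidate coefficients
    policy, critic_value = train(candidates)
    for alpha, prob in zip(candidates, policy):
        print(f"alpha={alpha:.2f}  EE={energy_efficiency(alpha):.3f}  prob={prob:.3f}")
```

In this toy setting the channel is static, so the Critic reduces to a learned baseline and the temporal-difference error has no bootstrapped next-state term. A time-varying channel would call for a state-dependent Critic and a discounted target, which this sketch deliberately leaves out.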
