When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework

Interference among concurrent transmissions in a wireless network is a key factor limiting system performance. One way to alleviate this problem is to manage the radio resources so as to maximize either the average or the worst-case performance. However, the two metrics are rarely considered jointly, as they are competing in nature. In this article, a mechanism for radio resource management based on multi-agent deep reinforcement learning (RL) is proposed, which strikes a trade-off between maximizing the average and the $5^{th}$ percentile user throughput. Each transmitter in the network is equipped with a deep RL agent that receives partial observations of the network state (e.g., channel quality and interference level) and decides whether to be active or inactive on the given radio resources at each scheduling interval, a process referred to as link scheduling. Based on the actions of all agents, the network returns a reward to the agents, indicating how good their joint decisions were. The proposed framework enables the agents to make decisions in a distributed manner, and the reward is designed so that the agents strive to guarantee a minimum performance, leading to a fair resource allocation among all users across the network. Simulation results show that the proposed approach outperforms decentralized baselines in terms of both average and $5^{th}$ percentile user throughput, while performing close to a centralized exhaustive search. Moreover, the framework is robust to mismatches between training and testing scenarios: an agent trained on a network with a low transmitter density maintains its performance and outperforms the baselines when deployed in a network with a higher transmitter density.
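To make the mechanism concrete, the following minimal sketch illustrates the per-transmitter decision and the spirit of the reward design. It assumes PyTorch and NumPy; the class `QNetwork`, the function names, the observation dimension, and the weight `alpha` are illustrative assumptions, not the authors' implementation.

```python
import random

import numpy as np
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Maps a partial observation (e.g., channel quality, interference level)
    to Q-values for the two link-scheduling actions: inactive (0) / active (1)."""

    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


def select_action(q_net: QNetwork, obs: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy scheduling decision for one transmitter at one interval."""
    if random.random() < epsilon:
        return random.randrange(2)  # explore: random on/off decision
    with torch.no_grad():
        return int(q_net(obs).argmax().item())  # exploit: best known action


def network_reward(user_throughputs: np.ndarray, alpha: float = 0.5) -> float:
    """Illustrative reward mixing average and tail performance; the exact
    reward shaping in the paper may differ. `alpha` is a hypothetical weight."""
    mean_part = alpha * float(np.mean(user_throughputs))
    tail_part = (1.0 - alpha) * float(np.percentile(user_throughputs, 5))
    return mean_part + tail_part
```

In such a setup, each transmitter would call `select_action` on its own local observation; once all agents have acted, the simulator computes the resulting user throughputs and broadcasts `network_reward`, which each agent can then use in a standard deep Q-learning update (e.g., with a replay buffer and a target network). Because the tail term penalizes starving any user, agents are pushed toward schedules that protect the worst-served links rather than maximizing the average alone.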
