Multi-Agent Deep Reinforcement Learning for Dynamic Power Allocation in Wireless Networks

This work demonstrates the potential of deep reinforcement learning techniques for transmit power control in wireless networks. Existing techniques typically find near-optimal power allocations by solving a challenging optimization problem. Most of these algorithms do not scale to large networks in real-world scenarios because of their computational complexity and their requirement for instantaneous cross-cell channel state information (CSI). In this paper, a distributively executed dynamic power allocation scheme is developed based on model-free deep reinforcement learning. Each transmitter collects CSI and quality of service (QoS) information from several neighbors and adapts its own transmit power accordingly. The objective is to maximize a weighted sum-rate utility function, which can be particularized to achieve maximum sum-rate or proportionally fair scheduling. Both random variations and delays in the CSI are inherently addressed using deep Q-learning. For a typical network architecture, the proposed algorithm is shown to achieve near-optimal power allocation in real time based on delayed CSI measurements available to the agents. The proposed scheme is especially suitable for practical scenarios where the system model is inaccurate and CSI delay is non-negligible.
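
To make the objective concrete, the following is a minimal sketch of a weighted sum-rate utility for a set of interfering links. It is an illustration under stated assumptions, not the paper's implementation: the function name `weighted_sum_rate`, the gain-matrix layout `gains[i][j]` (power gain from transmitter j to receiver i), the noise power, and the use of the Shannon rate log2(1 + SINR) are all assumptions chosen for this example.

```python
import math


def weighted_sum_rate(powers, gains, weights, noise_power=1.0):
    """Weighted sum of per-link Shannon rates (bits/s/Hz).

    Hypothetical helper for illustration only.
    powers[j]   : transmit power of transmitter j
    gains[i][j] : assumed power gain from transmitter j to receiver i
    weights[i]  : non-negative weight of link i
    """
    n = len(powers)
    total = 0.0
    for i in range(n):
        signal = gains[i][i] * powers[i]
        # Interference at receiver i from every other transmitter.
        interference = sum(gains[i][j] * powers[j] for j in range(n) if j != i)
        sinr = signal / (interference + noise_power)
        total += weights[i] * math.log2(1.0 + sinr)
    return total


# Two links with no cross-interference and equal weights:
# the utility reduces to the plain sum-rate.
isolated = [[1.0, 0.0], [0.0, 1.0]]
print(weighted_sum_rate([1.0, 3.0], isolated, [1.0, 1.0]))

# Adding cross gains lowers the utility for the same powers.
coupled = [[1.0, 0.5], [0.5, 1.0]]
print(weighted_sum_rate([1.0, 3.0], coupled, [1.0, 1.0]))
```

With all weights equal, this utility is the sum-rate. A common way to obtain proportionally fair behavior is to set each weight to the inverse of that link's long-term average rate; the exact weight update used in the paper is not given in this abstract.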
