Deep Reinforcement Learning-Based Resource Allocation and Power Control in Small Cells With Limited Information Exchange

In multi-user downlink small cell networks, cooperative resource allocation (RA) within a small cell cluster is a key technique for enhancing network capacity. However, capacity-maximizing RA in frequency-selective fading channels requires global channel state information (CSI) of all users within the cluster, which is infeasible in practical networks with limited direct link capacity. To circumvent the global CSI requirement, most existing studies on RA adopt restricted CSI assumptions such as local CSI or local CSI at the transmitters (CSIT). Nevertheless, the cost functions used with local CSI or local CSIT in the literature rely on heuristic formulations, because the sum-rate cannot be computed without global CSI. In this paper, we propose a deep reinforcement learning-based RA algorithm that maximizes the sum-rate for any given limited information on the instantaneous CSI or on the sum-rate in the previous period. The proposed scheme is not restricted to particular CSI assumptions; instead, it attempts to find the best RA for any given information, such as quantized local CSI or quantized local CSIT, and is therefore applicable to any given direct link capacity. The proposed algorithm is self-adaptive in time-varying channels, since it is not divided into training and test phases. We also modify the target neural network (TNN) scheme to improve the sum-rate and the convergence speed. Numerical simulations confirm that: i) the proposed algorithm outperforms conventional algorithms even under the same CSI assumption, such as local CSI or local CSIT; and ii) a flexible trade-off between the amount of CSI and the sum-rate is realizable in practical systems.
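To illustrate why the sum-rate depends on global CSI, the following minimal Python sketch computes the sum-rate on a single subchannel from a full matrix of channel gains, and shows one possible uniform quantizer for limited information exchange. It is not taken from the paper: the function names `sum_rate` and `quantize`, the single-antenna Shannon-rate model, the unit noise power, and the uniform quantizer are all assumptions made for illustration.

```python
import math


def sum_rate(gains, powers, noise=1.0):
    """Sum-rate in bits/s/Hz on one subchannel (illustrative model).

    gains[k][j]: channel power gain from base station j to the user of cell k.
    powers[j]:   transmit power of base station j.
    Each term needs the cross gains gains[k][j] for j != k, i.e. global CSI.
    """
    total = 0.0
    for k, row in enumerate(gains):
        signal = row[k] * powers[k]
        interference = sum(
            row[j] * powers[j] for j in range(len(powers)) if j != k
        )
        total += math.log2(1.0 + signal / (noise + interference))
    return total


def quantize(value, levels, max_value):
    """Map a value in [0, max_value] to an index in 0..levels-1 (assumed uniform)."""
    clipped = min(max(value, 0.0), max_value)
    return min(int(clipped / max_value * levels), levels - 1)


# Two cells without inter-cell interference: log2(1 + 1) + log2(1 + 3) = 3.
print(sum_rate([[1.0, 0.0], [0.0, 3.0]], [1.0, 1.0]))
# Adding a cross gain lowers the rate of the user in cell 0.
print(sum_rate([[1.0, 1.0], [0.0, 3.0]], [1.0, 1.0]))
```

In this sketch, a cell that knows only its own row of `gains` (local CSI) cannot evaluate the other cells' rate terms, which is the gap that heuristic cost functions try to fill; the number of quantization `levels` is one hypothetical way to express the trade-off between exchanged information and achievable sum-rate.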
