GAN-Based Deep Distributional Reinforcement Learning for Resource Management in Network Slicing

Network slicing is a key technology in 5G communications system, which aims to dynamically and efficiently allocate resources for diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aware allocation is of significant importance to network slicing. In this paper, we consider a scenario that contains several slices in one base station on sharing the same bandwidth. Deep reinforcement learning (DRL) is leveraged to solve this problem by regarding the varying demands and the allocated bandwidth as the environment \emph{state} and \emph{action}, respectively. In order to obtain better quality of experience (QoE) satisfaction ratio and spectrum efficiency (SE), we propose generative adversarial network (GAN) based deep distributional Q network (GAN- DDQN) to learn the distribution of state-action values. Furthermore, we estimate the distributions by approximating a full quantile function, which can make the training error more controllable. In order to protect the stability of GAN-DDQN's training process from the widely-spanning utility values, we also put forward a reward-clipping mechanism. Finally, we verify the performance of the proposed GAN-DDQN algorithm through extensive simulations.

[1]  Navid Nikaein,et al.  Network Slices toward 5G Communications: Slicing the LTE Network , 2017, IEEE Communications Magazine.

[2]  Zhi Chen,et al.  Intelligent Power Control for Spectrum Sharing in Cognitive Radios: A Deep Reinforcement Learning Approach , 2017, IEEE Access.

[3]  Qinru Qiu,et al.  A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).

[4]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[5]  Alex Graves,et al.  Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.

[6]  Naomi S. Altman,et al.  Quantile regression , 2019, Nature Methods.

[7]  Xianfu Chen,et al.  Deep Reinforcement Learning for Resource Management in Network Slicing , 2018, IEEE Access.

[8]  Francisco J. Martinez,et al.  When Vehicular Networks meet Artificial Intelligence , 2017, 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI).

[9]  Jing Wang,et al.  A deep reinforcement learning based framework for power-efficient resource allocation in cloud RANs , 2017, 2017 IEEE International Conference on Communications (ICC).

[10]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[11]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[12]  Lazaros Gkatzikis,et al.  The Algorithmic Aspects of Network Slicing , 2017, IEEE Communications Magazine.

[13]  Rémi Munos,et al.  Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.

[14]  Mahesan Niranjan,et al.  On-line Q-learning using connectionist systems , 1994 .

[15]  Marc G. Bellemare,et al.  A Distributional Perspective on Reinforcement Learning , 2017, ICML.

[16]  Victor C. M. Leung,et al.  Software-Defined Networks with Mobile Edge Computing and Caching for Smart Cities: A Big Data Deep Reinforcement Learning Approach , 2017, IEEE Communications Magazine.

[17]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[18]  Marc G. Bellemare,et al.  Distributional Reinforcement Learning with Quantile Regression , 2017, AAAI.

[19]  Matthew W. Hoffman,et al.  Distributed Distributional Deterministic Policy Gradients , 2018, ICLR.

[20]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[21]  Mahesh K. Marina,et al.  Network Slicing in 5G: Survey and Challenges , 2017, IEEE Communications Magazine.