Learn to adapt: Self-optimizing small cell transmit power with correlated bandit learning

Judiciously setting a base station transmit power that matches its deployment environment is a key problem in ultra-dense networks and heterogeneous in-building cellular deployments. A unique characteristic of this problem is the tradeoff between sufficient indoor coverage and limited outdoor leakage, which must be met without explicit knowledge of the environment. In this paper, we address the small base station (SBS) transmit power assignment problem based on stochastic bandit theory. We explicitly consider power switching penalties to discourage frequent changes of the transmit power, which cause varying coverage and an uneven user experience. Unlike existing solutions that rely on RF surveys in the target area, we take advantage of user behavior through simple coverage feedback in the network. In addition, the proposed power assignment algorithms follow the Bayesian principle to exploit the prior knowledge and correlation structure available from the self-configuration phase. Simulations mimicking practical deployments are performed for both single- and multiple-SBS scenarios, and the resulting power settings are compared to state-of-the-art solutions. Significant performance gains of the proposed algorithms are observed.
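To illustrate the core idea of bandit-based power assignment with a switching penalty, here is a minimal, hypothetical sketch. It is not the paper's algorithm (the paper uses Bayesian priors and the correlation structure between power levels); this simplified version uses a UCB1-style index over discrete power levels and subtracts a fixed penalty from the index of every arm other than the currently selected one, discouraging frequent power changes. The function names, the `switch_cost` parameter, and the coverage-feedback reward model are all illustrative assumptions.

```python
import math
import random


def select_power_bandit(power_levels, reward_fn, horizon, switch_cost=0.1, seed=0):
    """UCB1-style bandit over discrete transmit-power levels (illustrative sketch).

    A fixed switching penalty is subtracted from the UCB index of every
    arm except the currently selected one, so the algorithm only changes
    the transmit power when the expected gain outweighs the penalty.
    Returns the power level with the best empirical mean reward and the
    sequence of arm indices played.
    """
    rng = random.Random(seed)
    n = len(power_levels)
    counts = [0] * n       # number of pulls per power level
    means = [0.0] * n      # empirical mean coverage reward per power level
    current = None         # index of the currently active power level
    choices = []
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1    # initialization: try each power level once
        else:
            def index(i):
                bonus = math.sqrt(2.0 * math.log(t) / counts[i])
                penalty = 0.0 if i == current else switch_cost
                return means[i] + bonus - penalty
            arm = max(range(n), key=index)
        r = reward_fn(power_levels[arm], rng)      # noisy coverage feedback
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        current = arm
        choices.append(arm)
    best = max(range(n), key=lambda i: means[i])
    return power_levels[best], choices
```

As a usage example, one might model the coverage feedback as a noisy, single-peaked function of power (too low loses indoor coverage, too high leaks outdoors), e.g. `reward_fn = lambda p, rng: 1.0 - abs(p - 16) / 10.0 + rng.gauss(0, 0.05)` over levels `[10, 13, 16, 19, 22]` dBm; after a few hundred rounds the algorithm settles on the level nearest the peak while making few switches.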
