Small Cell Power Assignment with Unimodal Continuum-Armed Bandit Learning

Setting a base station's transmit power to match its deployment environment is a key problem in ultra-dense networks and heterogeneous in-building cellular deployments. A distinctive characteristic of this problem is the tradeoff between sufficient indoor coverage and limited outdoor leakage, which has to be balanced without explicit knowledge of the environment. In this paper, we address the small base station (SBS) transmit power assignment problem using stochastic bandit learning with a continuous set of arms, avoiding the constant performance loss or heavy initialization workload that crude or excessive sampling causes in previous strategies. With the aim of minimizing the expected cumulative performance loss, we exploit the unimodality of the performance function, which accelerates the search for the globally optimal power value. Simulations mimicking practical deployments are performed for both single-SBS and multiple-SBS scenarios, and the resulting power settings are compared with state-of-the-art solutions. The proposed algorithms show significant performance gains.
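To illustrate why unimodality helps when searching a continuous power range, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a hypothetical noisy performance function `noisy_reward` with a single peak (the peak location, curvature, and noise level are made up for illustration) and applies a golden-section-style interval elimination to averaged noisy samples. It uses a fixed number of samples per probe point, whereas a bandit algorithm would adapt its sampling to limit cumulative loss.

```python
import random
import statistics


def noisy_reward(power_dbm, rng, peak_dbm=15.0, noise_std=0.02):
    """Hypothetical unimodal performance function (illustrative assumption).

    The mean rises up to peak_dbm (better indoor coverage) and falls
    beyond it (more outdoor leakage). Gaussian noise models measurement
    randomness.
    """
    mean = 1.0 - ((power_dbm - peak_dbm) / 10.0) ** 2
    return mean + rng.gauss(0.0, noise_std)


def unimodal_search(reward, low, high, rng, samples_per_point=100, tol=0.1):
    """Shrink [low, high] around the peak of a unimodal noisy function.

    Each iteration probes two interior points, averages noisy samples at
    each, and discards the side of the interval that cannot contain the
    peak if the averaged comparison is correct.
    """
    inv_phi = (5 ** 0.5 - 1) / 2  # about 0.618
    while high - low > tol:
        a = high - inv_phi * (high - low)  # left probe point
        b = low + inv_phi * (high - low)   # right probe point
        mean_a = statistics.fmean(reward(a, rng) for _ in range(samples_per_point))
        mean_b = statistics.fmean(reward(b, rng) for _ in range(samples_per_point))
        if mean_a < mean_b:
            low = a   # peak lies to the right of a
        else:
            high = b  # peak lies to the left of b
    return (low + high) / 2


if __name__ == "__main__":
    rng = random.Random(0)
    estimate = unimodal_search(noisy_reward, 0.0, 24.0, rng)
    print(f"estimated best power: {estimate:.2f} dBm")
```

Because each comparison removes a constant fraction of the interval, the number of probe points grows only logarithmically with the required resolution, rather than linearly as with a uniform grid over the power range. Near the peak the two probe means become hard to distinguish under noise, so the fixed sample count used here limits the accuracy this sketch can reach.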
