Paper information - Kernel-based Multi-Task Contextual Bandits in Cellular Network Configuration

Cellular network configuration plays a critical role in network performance. In current practice, network configuration depends heavily on the field experience of engineers and often remains static for long periods of time. This practice is far from optimal. To address this limitation, online-learning-based approaches have great potential to automate and optimize network configuration. Learning-based approaches face two challenges: learning a highly complex function for each base station, and balancing the fundamental exploration-exploitation tradeoff while minimizing the exploration cost. Fortunately, in cellular networks, base stations (BSs) are often similar even though they are not identical. To leverage such similarities, we propose a kernel-based multi-BS contextual bandit algorithm based on multi-task learning. In the algorithm, we leverage the similarity among different BSs as defined by conditional kernel embedding. We present a theoretical analysis of the proposed algorithm in terms of regret and multi-task-learning efficiency. We evaluate the effectiveness of our algorithm on a simulator built from real traces.
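The general idea of a kernel-based multi-task contextual bandit can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes the BS-to-BS similarity is supplied as a fixed positive semidefinite matrix (the abstract instead derives it from conditional kernel embedding), uses an RBF kernel on the concatenated context and configuration, and scores each candidate with a kernel-ridge mean plus an uncertainty bonus. The class name `MultiTaskKernelUCB` and the parameters `task_sim`, `lam`, `beta`, and `gamma` are hypothetical names chosen for this sketch.

```python
import numpy as np


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two vectors."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * (diff @ diff)))


class MultiTaskKernelUCB:
    """Illustrative kernel UCB that shares observations across base stations."""

    def __init__(self, task_sim, lam=1.0, beta=1.0, gamma=1.0):
        # ASSUMPTION: task_sim[i, j] is a given similarity between BS i and BS j.
        self.task_sim = np.asarray(task_sim, dtype=float)
        self.lam, self.beta, self.gamma = lam, beta, gamma
        self.tasks, self.inputs, self.rewards = [], [], []

    def _k(self, t1, z1, t2, z2):
        # Product kernel: BS similarity times similarity of (context, configuration).
        return self.task_sim[t1, t2] * rbf(z1, z2, self.gamma)

    def ucb(self, task, z):
        """Upper confidence bound on the reward of input z at base station `task`."""
        prior = self._k(task, z, task, z)
        n = len(self.inputs)
        if n == 0:
            return self.beta * np.sqrt(prior)
        gram = np.array([[self._k(self.tasks[i], self.inputs[i],
                                  self.tasks[j], self.inputs[j])
                          for j in range(n)] for i in range(n)])
        k_vec = np.array([self._k(task, z, self.tasks[i], self.inputs[i])
                          for i in range(n)])
        reg = gram + self.lam * np.eye(n)
        mean = k_vec @ np.linalg.solve(reg, np.array(self.rewards))
        var = prior - k_vec @ np.linalg.solve(reg, k_vec)
        return mean + self.beta * np.sqrt(max(var, 0.0))

    def select(self, task, context, arms):
        """Return the index of the candidate configuration with the highest UCB."""
        scores = [self.ucb(task, np.concatenate([context, arm])) for arm in arms]
        return int(np.argmax(scores))

    def update(self, task, context, arm, reward):
        """Record one observed reward for a (BS, context, configuration) triple."""
        self.tasks.append(task)
        self.inputs.append(np.concatenate([context, arm]).astype(float))
        self.rewards.append(float(reward))
```

In this sketch, a reward observed at one BS changes the estimate at another BS in proportion to their similarity entry, which is how similar BSs can reduce each other's exploration cost. Setting `task_sim` to the identity matrix reduces it to independent per-BS learners.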
