A Collaborative Learning Based Approach for Parameter Configuration of Cellular Networks

Cellular network performance depends heavily on the configuration of its network parameters. Current practice relies largely on expert experience, which is often suboptimal, time-consuming, and error-prone. It is therefore desirable to automate parameter configuration with learning-based approaches that improve both accuracy and efficiency. Such approaches must address several challenges in real operational networks: a lack of diverse historical data, a limited experiment budget set by network operators, and highly complex, unknown network performance functions. To address these challenges, we propose a collaborative learning approach that leverages data from different cells to boost learning efficiency and improve network performance. Specifically, we formulate the problem as a transferable contextual bandit problem and prove that transfer learning can significantly reduce the regret bound. Based on this theoretical result, we further develop a practical algorithm that decomposes a cell's policy into a common homogeneous policy, learned using all cells' data, and a cell-specific policy that captures each individual cell's heterogeneous behavior. We evaluate the proposed algorithm on a simulator constructed from real network data and demonstrate faster convergence compared to baselines. More importantly, we also conduct a two-week live field test on a real metropolitan cellular network consisting of 1700+ cells to optimize five parameters. The proposed algorithm shows a significant performance improvement of 20%.
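The decomposition into a common policy and a cell-specific policy can be pictured with a small sketch. This is not the algorithm from the paper: it is a simplified, purely greedy illustration that assumes a finite set of candidate configurations and scalar rewards, and it omits contexts and exploration altogether. The function names, the `shrinkage` parameter, and the toy data are hypothetical.

```python
from collections import defaultdict


def fit_decomposed(observations, shrinkage=5.0):
    """Fit a pooled estimate plus a per-cell correction (illustrative only).

    observations: list of (cell_id, config, reward) tuples.
    Returns (common, specific):
      common[config]           pooled mean reward over all cells
      specific[(cell, config)] shrunken mean residual for that cell
    """
    pooled_sum = defaultdict(float)
    pooled_n = defaultdict(int)
    for _cell, config, reward in observations:
        pooled_sum[config] += reward
        pooled_n[config] += 1
    common = {c: pooled_sum[c] / pooled_n[c] for c in pooled_sum}

    resid_sum = defaultdict(float)
    resid_n = defaultdict(int)
    for cell, config, reward in observations:
        resid_sum[(cell, config)] += reward - common[config]
        resid_n[(cell, config)] += 1
    # Dividing by (n + shrinkage) pulls corrections for rarely observed
    # cells toward zero, so such cells lean on the pooled estimate.
    specific = {k: resid_sum[k] / (resid_n[k] + shrinkage) for k in resid_sum}
    return common, specific


def predict(common, specific, cell, config):
    """Predicted reward = common part + cell-specific part (0 if unseen)."""
    return common.get(config, 0.0) + specific.get((cell, config), 0.0)


def best_config(common, specific, cell):
    """Greedy choice for one cell; a bandit method would also explore."""
    return max(common, key=lambda c: predict(common, specific, cell, c))


# Toy data: three cells, two hypothetical configurations.
data = [
    ("A", "low", 1.0), ("A", "high", 3.0),
    ("B", "low", 1.0), ("B", "high", 2.0),
    ("C", "high", 4.0),
]
common, specific = fit_decomposed(data, shrinkage=1.0)
print(best_config(common, specific, "D"))  # a cell with no data of its own
```

In this sketch a cell with no observations falls back entirely on the pooled estimate, which is the intuition behind sharing data across cells when each cell's own experiment budget is small.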
