Linear Approximation based Q-Learning for Edge Caching in Massive MIMO Networks

Caching important content in advance is one of the key solutions for meeting the increasing demands of wireless multimedia communications. Optimal caching depends on future content popularity, which is unknown. In this paper, content popularity is modeled as a finite-state Markov chain, and Q-learning, a reinforcement learning method, is employed to learn the optimal content placement strategy that maximizes the average success probability (ASP) in a caching network whose massive MIMO base stations are distributed according to a homogeneous Poisson point process (PPP). To improve on Q-learning, a linear function approximation based Q-learning is proposed in which only a constant number of parameters (three) need to be updated, irrespective of the sizes of the state and action sets, whereas Q-learning in this context requires updating a number of parameters equal to the number of states times the number of actions. Given a set of available placement strategies, simulations show that the approximate Q-learning converges and learns the same best content placement as Q-learning, which demonstrates the applicability and scalability of the approximate Q-learning.
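
As a rough illustration of the idea, the sketch below runs Q-learning with a linear approximation Q(s, a) ≈ θ·φ(s, a) using three parameters on a toy problem. Everything specific in it is an assumption made for illustration and is not taken from the paper: the two popularity states, the transition matrix, the one-file placement strategies, the three features, and the reward, which is a simple cache-hit probability standing in for the ASP of the PPP massive MIMO model.

```python
import random

# Hypothetical toy model (all values assumed for illustration).
# Two popularity states, each a request distribution over three files.
POPULARITY = [[0.7, 0.2, 0.1], [0.2, 0.3, 0.5]]
# Assumed finite-state Markov chain over the popularity states.
TRANSITION = [[0.8, 0.2], [0.3, 0.7]]
# Available placement strategies: cache exactly one of the three files.
PLACEMENTS = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def reward(state, action):
    """Stand-in for the ASP: probability that the requested file is cached."""
    return sum(p * q for p, q in zip(POPULARITY[state], PLACEMENTS[action]))


def features(state, action):
    """Three hypothetical features: bias, cache-hit proxy, peak popularity."""
    return [1.0, reward(state, action), max(POPULARITY[state])]


def q_value(theta, state, action):
    """Linear approximation Q(s, a) = theta . phi(s, a)."""
    return sum(t * f for t, f in zip(theta, features(state, action)))


def greedy_action(theta, state):
    """Index of the placement with the largest approximate Q-value."""
    return max(range(len(PLACEMENTS)), key=lambda a: q_value(theta, state, a))


def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0, 0.0]  # three parameters, independent of |S| and |A|
    state = 0
    for _ in range(steps):
        # Epsilon-greedy choice among the available placements.
        if rng.random() < epsilon:
            action = rng.randrange(len(PLACEMENTS))
        else:
            action = greedy_action(theta, state)
        r = reward(state, action)
        # Popularity evolves according to the Markov chain.
        next_state = rng.choices(range(len(TRANSITION)),
                                 weights=TRANSITION[state])[0]
        # Semi-gradient Q-learning update on the three weights.
        best_next = q_value(theta, next_state, greedy_action(theta, next_state))
        td_error = r + gamma * best_next - q_value(theta, state, action)
        phi = features(state, action)
        theta = [t + alpha * td_error * f for t, f in zip(theta, phi)]
        state = next_state
    return theta


theta = train()
best = [greedy_action(theta, s) for s in range(len(POPULARITY))]
print("theta:", theta)
print("greedy placement per popularity state:", best)
```

A table-based Q-learning agent for the same toy problem would store one value per state-action pair (2 × 3 = 6 entries here), and that table grows with the state and action sets, while `theta` stays at three entries. In this toy setup one would expect the greedy placement to follow the most popular file in each state, since the cache-hit feature is the only one that varies with the action.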
