Paper Information - Function Approximation Based Reinforcement Learning for Edge Caching in Massive MIMO Networks

Caching popular content in advance is an important technique for achieving low latency and reducing backhaul congestion in future wireless communication systems. This article considers a multi-cell massive multiple-input multiple-output (MIMO) system in which the base station locations are distributed as a Poisson point process. Assuming probabilistic caching, the average success probability (ASP) of the system is derived for a known content popularity (CP) profile, which in practice is time-varying and unknown in advance. Further, modeling CP variations across time as a Markov process, reinforcement $Q$-learning is employed to learn the optimal content placement strategy that optimizes the long-term discounted ASP and the average cache refresh rate. In $Q$-learning, the number of $Q$-updates is large and proportional to the number of states and actions. To reduce the space complexity and update requirements and so make $Q$-learning scalable, two novel function-approximation-based $Q$-learning approaches (one linear, one non-linear) are proposed, in which only a constant number of variables (4 and 3, respectively) need to be updated, irrespective of the number of states and actions. The convergence of these approximation-based approaches is analyzed. Simulations verify that both approaches converge and learn a similar best content placement, which demonstrates the applicability and scalability of the proposed approximated $Q$-learning schemes.
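To make the idea of updating a constant number of variables concrete, here is a minimal sketch of $Q$-learning with a linear function approximation, where Q(s, a) is approximated by the inner product of a 4-element weight vector `theta` and a feature vector. It is an illustration only, not the scheme proposed in the article: the abstract does not specify the features, so the four features below, the toy popularity profiles, the transition matrix, the reward (the popularity of the single cached content, standing in for the ASP), and all hyperparameters are assumptions. The cache refresh rate is omitted for brevity.

```python
import numpy as np

# Toy setting (all values are illustrative assumptions, not from the article):
# 3 popularity states, 4 contents; the action is the single content to cache.
POPULARITY = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.10, 0.60, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
])
# Markov transition probabilities between popularity states (assumed).
TRANSITION = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])
N_STATES, N_ACTIONS = POPULARITY.shape


def features(state, action):
    """Hypothetical 4-dimensional feature vector phi(s, a)."""
    p = POPULARITY[state]
    return np.array([
        1.0,                               # bias term
        p[action],                         # popularity of the cached content
        p.max(),                           # peak popularity in this state
        float(action == int(p.argmax())),  # 1 if the most popular content is cached
    ])


def q_values(theta, state):
    """Approximate Q(s, a) = theta . phi(s, a) for every action a."""
    return np.array([theta @ features(state, a) for a in range(N_ACTIONS)])


def q_update(theta, phi_sa, reward, q_next, alpha, gamma):
    """One semi-gradient Q-learning step; only the 4 weights change."""
    td_error = reward + gamma * np.max(q_next) - theta @ phi_sa
    return theta + alpha * td_error * phi_sa


def train(steps=5000, alpha=0.05, gamma=0.5, epsilon=0.2, seed=0):
    """Run epsilon-greedy Q-learning on the toy Markov popularity model."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(4)
    state = 0
    for _ in range(steps):
        # Epsilon-greedy choice of the content to cache.
        if rng.random() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(q_values(theta, state)))
        reward = POPULARITY[state, action]  # stand-in for the success probability
        next_state = int(rng.choice(N_STATES, p=TRANSITION[state]))
        theta = q_update(theta, features(state, action), reward,
                         q_values(theta, next_state), alpha, gamma)
        state = next_state
    return theta
```

Calling `theta = train()` returns the learned weights, and `np.argmax(q_values(theta, s))` gives the greedy placement for state `s`. A tabular $Q$-learner would store one entry per state-action pair (12 here, and far more in a realistic system), whereas this sketch stores and updates 4 weights regardless of how many states and actions there are.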
