Adaptive Caching via Deep Reinforcement Learning

Caching is envisioned to play a critical role in next-generation content delivery infrastructure, cellular networks, and Internet architectures. By storing the most popular contents at storage-enabled network entities during off-peak periods, caching can benefit both the network infrastructure and end users during peak periods. In this context, distributing the limited storage capacity across network entities calls for decentralized caching schemes. Many practical caching systems involve a parent caching node connected to multiple leaf nodes to serve user file requests. To model the two-way interactive influence between caching decisions at the parent and leaf nodes, a reinforcement learning framework is put forth. To handle the large continuous state space, a scalable deep reinforcement learning approach is pursued. The novel approach relies on a deep Q-network to learn the Q-function, and thus the optimal caching policy, in an online fashion. Endowing the parent node with the ability to learn and adapt to the unknown policies of leaf nodes, as well as to the spatio-temporal evolution of file requests, results in remarkable caching performance, as corroborated through numerical tests.
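As a rough illustration of what learning a Q-function online involves, the sketch below performs one Q-learning update with a linear function approximator standing in for the deep Q-network. It is not the method of the paper: the state is assumed to be a feature vector (for example, recent request statistics seen at the parent node), each action is assumed to index one candidate cache configuration, and the names `q_values`, `epsilon_greedy`, and `td_update`, along with the values of `gamma` and `lr`, are hypothetical.

```python
import numpy as np


def q_values(weights, state):
    """Linear Q-function (assumed stand-in for a deep Q-network).

    weights has one row per action; state is a feature vector.
    """
    return weights @ state


def epsilon_greedy(weights, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(weights.shape[0]))
    return int(np.argmax(q_values(weights, state)))


def td_update(weights, state, action, reward, next_state, gamma=0.9, lr=0.01):
    """One semi-gradient Q-learning step; updates weights in place.

    Returns the temporal-difference error for inspection.
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    target = reward + gamma * np.max(q_values(weights, next_state))
    td_error = target - q_values(weights, state)[action]
    # For a linear Q-function the gradient w.r.t. the chosen row is the state.
    weights[action] += lr * td_error * state
    return td_error
```

In a caching setting the reward would typically be the negative of a caching cost, so that maximizing the Q-function favors configurations that serve more requests locally; that cost model is likewise an assumption here, not something the abstract specifies.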
