Deep reinforcement learning applied to the k-server problem

Abstract The reinforcement learning paradigm has been shown to be an effective approach to solving the k-server problem. However, this approach relies on the tabular Q-learning algorithm and is therefore subject to the curse of dimensionality, since the action-value function (Q-function) grows exponentially with the number of states and actions. In this work, a new algorithm based on the deep reinforcement learning paradigm is proposed, in which the Q-function is defined by a multilayer perceptron neural network that extracts information about the environment from images encoding the dynamics of the problem. The applicability of the proposed algorithm is illustrated in a case study that considers problem configurations with different numbers of nodes and servers. The agent's behavior is analyzed during the training phase, and its efficiency is evaluated through performance tests that quantify the quality of the generated server-displacement policies. The results obtained show the new algorithm to be a promising alternative solution to the k-server problem.
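To make the described architecture concrete, the sketch below shows a minimal DQN-style training loop in which the Q-function is a multilayer perceptron over a flattened, image-like binary encoding of the k-server state (one row marking server positions, one marking the current request). This is an assumed reconstruction, not the authors' implementation: the environment, the encoding, the network width, and all hyperparameters (N_NODES, K_SERVERS, learning rate, gamma, eps) are illustrative choices.

```python
# Minimal sketch of a deep-RL agent for the k-server problem.
# Assumption: the action is the index of the server sent to serve the request,
# and the reward is the negative movement cost on a random symmetric metric.
import random

import numpy as np
import torch
import torch.nn as nn

N_NODES, K_SERVERS = 10, 2      # assumed problem size
STATE_DIM = 2 * N_NODES        # flattened 2 x N_NODES binary "image"

class QNetwork(nn.Module):
    """MLP mapping the flattened image encoding to one Q-value per server."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def encode_state(servers, request):
    """Binary image: row 0 marks nodes occupied by servers, row 1 the request."""
    img = np.zeros((2, N_NODES), dtype=np.float32)
    img[0, servers] = 1.0
    img[1, request] = 1.0
    return torch.from_numpy(img.flatten())

# Random symmetric metric with zero diagonal (illustrative only).
rng = np.random.default_rng(0)
dist = rng.uniform(1.0, 10.0, size=(N_NODES, N_NODES))
dist = (dist + dist.T) / 2.0
np.fill_diagonal(dist, 0.0)

q_net = QNetwork(STATE_DIM, K_SERVERS)
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
loss_fn = nn.SmoothL1Loss()            # Huber loss, as in standard DQN
gamma, eps = 0.95, 0.1                 # assumed discount and exploration rate

servers = [0, 1]                       # initial server positions
request = int(rng.integers(N_NODES))
for step in range(1000):
    s = encode_state(servers, request)
    # Epsilon-greedy choice of which server serves the current request.
    if random.random() < eps:
        a = random.randrange(K_SERVERS)
    else:
        with torch.no_grad():
            a = int(q_net(s).argmax())
    reward = -float(dist[servers[a], request])  # negative movement cost
    servers[a] = request                        # move the chosen server
    request = int(rng.integers(N_NODES))        # next request arrives
    s_next = encode_state(servers, request)
    with torch.no_grad():
        target = reward + gamma * q_net(s_next).max()
    loss = loss_fn(q_net(s)[a], target)         # one-step TD error
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The Huber (smooth L1) loss replaces a plain squared error, following standard DQN practice; a fuller implementation would also add experience replay and a separate target network for stability.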
