Learning Index Selection with Structured Action Spaces

Configuration spaces for computer systems can be challenging for traditional and automatic tuning strategies. Injecting task-specific knowledge into the tuner may allow for more efficient exploration of candidate configurations. We apply this idea to index set selection for accelerating database workloads. Index set selection has proven amenable to recent applications of vanilla deep reinforcement learning (RL), but real deployments remain out of reach. In this paper, we explore how learned index selection can be enhanced with task-specific inductive biases, specifically by encoding those biases in the structure of the action space. Reformulating the problem in terms of permutation learning yields action representations specific to index selection, and we build on recent work for learning RL policies over permutations. With this approach, we construct an indexing agent that achieves improved index selections and validate its behavior with task-specific statistics. Early experiments show that our agent finds configurations up to 40% smaller than those of comparable approaches at the same latency, and that it exhibits more intuitive indexing behavior.

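To make the permutation reformulation concrete, the sketch below shows the Sinkhorn relaxation that underlies Gumbel-Sinkhorn networks (Mena et al., 2018) and the Sinkhorn policy gradient: a score matrix is perturbed with Gumbel noise, alternately row- and column-normalized toward a doubly stochastic matrix (Sinkhorn, 1967), and then rounded to a hard permutation over candidate indexes. This is a minimal illustration under stated assumptions, not the paper's agent; the function names, the greedy rounding step, and the toy score matrix are all hypothetical.

```python
import numpy as np

def sinkhorn(log_alpha, n_iters=20):
    """Sinkhorn normalization in log space: alternately normalize rows and
    columns so that exp(log_alpha) approaches a doubly stochastic matrix,
    a continuous relaxation of a permutation matrix (Sinkhorn, 1967)."""
    for _ in range(n_iters):
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=1, keepdims=True)
        log_alpha = log_alpha - np.logaddexp.reduce(log_alpha, axis=0, keepdims=True)
    return np.exp(log_alpha)

def gumbel_sinkhorn_sample(scores, temperature=1.0, rng=None):
    """Perturb a score matrix with Gumbel noise, relax it with the Sinkhorn
    operator, and round greedily to a hard permutation (hypothetical helper;
    the Gumbel-Sinkhorn construction follows Mena et al., 2018)."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-9, 1.0, size=scores.shape)
    gumbel = -np.log(-np.log(u))
    soft = sinkhorn((scores + gumbel) / temperature)
    # Greedy rounding: assign each row, in confidence order, to its best
    # still-unused column. Hungarian matching would be exact; greedy keeps
    # the sketch short.
    n = scores.shape[0]
    perm, taken = np.full(n, -1), set()
    for i in np.argsort(-soft.max(axis=1)):
        for j in np.argsort(-soft[i]):
            if j not in taken:
                perm[i] = j
                taken.add(j)
                break
    return perm

# Toy usage: order four hypothetical candidate indexes. In an indexing agent,
# `scores` would come from a policy network conditioned on the workload.
rng = np.random.default_rng(0)
print(gumbel_sinkhorn_sample(rng.normal(size=(4, 4)), rng=rng))
```

Because the relaxation is differentiable in the scores, a policy network can be trained end to end; the hard rounding is applied only when the permutation is executed against the database.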