MANTIS: Multiple Type and Attribute Index Selection using Deep Reinforcement Learning

DBMS performance depends on many parameters, such as index selection, cache size, physical layout, and data partitioning. Some combinations of these parameters yield optimal performance for a given workload, but selecting an optimal or near-optimal combination is challenging, especially for large databases with complex workloads. Among the hundreds of parameters, index selection is arguably the most critical for performance. We propose a self-administered framework, called the Multiple Type and Attribute Index Selector (MANTIS), that automatically selects near-optimal indexes. The framework advances the state of the art in index selection by considering both multi-attribute indexes and multiple index types within a bounded storage size constraint, a combination not previously addressed. MANTIS combines supervised and reinforcement learning: a Deep Neural Network recommends the type of index for a given workload, while a Deep Q-Learning network recommends the attribute combinations to index. MANTIS is sensitive to storage cost constraints and incorporates noisy rewards in its reward function for better performance. Our experimental evaluation shows that MANTIS outperforms the current state-of-the-art methods by an average of 9.53% QphH@size.
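
To make the storage-bounded selection problem concrete, the following is a minimal sketch and not the MANTIS implementation. It swaps the Deep Q-Learning network for plain tabular Q-learning and omits the supervised index-type model and the noisy-reward handling. The candidate indexes, their sizes, the estimated savings, the 100 MB budget, and the assumption that savings are additive are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical candidates: (index type, attributes) -> (size in MB, estimated saving).
# All values are made up for illustration; savings are assumed to be additive.
CANDIDATES = {
    ("btree", ("l_shipdate",)): (40, 30.0),
    ("btree", ("l_shipdate", "l_discount")): (70, 55.0),
    ("hash", ("o_orderkey",)): (30, 20.0),
    ("brin", ("l_shipdate",)): (5, 12.0),
}
BUDGET_MB = 100  # assumed storage bound


def size_of(config):
    """Total storage used by a set of indexes."""
    return sum(CANDIDATES[c][0] for c in config)


def saving_of(config):
    """Total estimated saving of a set of indexes."""
    return sum(CANDIDATES[c][1] for c in config)


def feasible_actions(config):
    """Candidates not yet chosen that still fit within the storage budget."""
    used = size_of(config)
    return [c for c in CANDIDATES
            if c not in config and used + CANDIDATES[c][0] <= BUDGET_MB]


def train(episodes=2000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning; a state is the frozenset of indexes chosen so far."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = frozenset()
        actions = feasible_actions(state)
        while actions:
            # Explore uniformly at random; Q-learning is off-policy, so the
            # greedy policy can still be read from the table afterwards.
            action = rng.choice(actions)
            next_state = state | {action}
            next_actions = feasible_actions(next_state)
            reward = CANDIDATES[action][1]
            best_next = max(
                (q.get((next_state, a), 0.0) for a in next_actions),
                default=0.0,
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state, actions = next_state, next_actions
    return q


def greedy_config(q):
    """Follow the highest-valued action until nothing else fits the budget."""
    state = frozenset()
    actions = feasible_actions(state)
    while actions:
        action = max(actions, key=lambda a: q.get((state, a), 0.0))
        state = state | {action}
        actions = feasible_actions(state)
    return state


selected = greedy_config(train())
print(sorted(selected), size_of(selected), saving_of(selected))
```

Because infeasible actions are masked out, the agent can never exceed the budget. A penalty term in the reward would be an alternative way to encode the same constraint.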
