A Unified and Efficient Coordinating Framework for Autonomous DBMS Tuning

Using machine learning (ML) techniques to optimize modern database management systems (DBMSs) has recently attracted intensive interest from both industry and academia. When targeted at a specific component of a DBMS (e.g., index selection, knob tuning), ML-based tuning agents have been shown to find better configurations than experienced database administrators. However, one critical yet challenging question remains unexplored: how to make those ML-based tuning agents work collaboratively. Existing methods do not consider the dependencies among multiple agents, and the model used by each agent only studies the effect of changing the configuration of a single component. To tune different components of a DBMS, a coordinating mechanism is needed to make the agents cognizant of each other. We also need to decide how to allocate the limited tuning budget among the agents to maximize performance. Such a decision is difficult to make because the reward distribution of each agent is unknown and non-stationary. In this paper, we study the above question and present a unified coordinating framework that efficiently utilizes existing ML-based agents. First, we propose a message propagation protocol that specifies the collaboration behaviors of agents and encapsulates the global tuning messages in each agent's model. Second, we combine Thompson Sampling, a well-studied reinforcement learning algorithm, with a memory buffer so that our framework can allocate budget judiciously in a non-stationary environment. Our framework defines interfaces that adapt to a broad class of ML-based tuning agents, yet are simple enough for integration with existing implementations and future extensions. We show that it can effectively utilize different ML-based agents and find better configurations, with 1.4x to 14.1x speedups in workload execution time compared with baselines.
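The abstract does not spell out how Thompson Sampling is combined with the memory buffer, so the following is only a minimal sketch of one plausible reading: a Beta-Bernoulli Thompson sampler whose posterior for each agent is computed from a bounded window of recent outcomes, so that stale observations stop influencing the budget allocation. The class name `WindowedThompsonSampler`, the binary "did this tuning step improve performance" reward, the window size, and the agent names are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import deque


class WindowedThompsonSampler:
    """Hypothetical sketch: Beta-Bernoulli Thompson Sampling over a sliding window.

    Each tuning agent is treated as a bandit arm. Only the most recent
    `window` outcomes per agent are remembered, which lets the estimate
    follow a reward distribution that drifts over time.
    """

    def __init__(self, agent_names, window=20, seed=None):
        self.rng = random.Random(seed)
        # One bounded buffer per agent; the oldest outcome is evicted when full.
        self.buffers = {name: deque(maxlen=window) for name in agent_names}

    def posterior(self, name):
        """Return the (alpha, beta) parameters of the Beta posterior for an agent."""
        buf = self.buffers[name]
        successes = sum(buf)
        failures = len(buf) - successes
        # Beta(1, 1) is a uniform prior (an assumption of this sketch).
        return 1 + successes, 1 + failures

    def choose(self):
        """Sample a success rate per agent and pick the agent with the largest draw."""
        draws = {}
        for name in self.buffers:
            alpha, beta = self.posterior(name)
            draws[name] = self.rng.betavariate(alpha, beta)
        return max(draws, key=draws.get)

    def update(self, name, improved):
        """Record whether the chosen agent's tuning step improved performance."""
        self.buffers[name].append(1 if improved else 0)


if __name__ == "__main__":
    # Toy non-stationary environment: the "index" agent pays off early,
    # the "knob" agent pays off later. The probabilities are made up.
    sampler = WindowedThompsonSampler(["index", "knob"], window=20, seed=0)
    env = random.Random(1)
    for step in range(200):
        p = {"index": 0.8, "knob": 0.2} if step < 100 else {"index": 0.2, "knob": 0.8}
        agent = sampler.choose()
        sampler.update(agent, env.random() < p[agent])
    print({name: sampler.posterior(name) for name in ["index", "knob"]})
```

A bounded `deque` is used here because it makes forgetting explicit and cheap; a discounted posterior would be another way to handle drift, and nothing in the abstract indicates which variant the authors use.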
