UDO: Universal Database Optimization using Reinforcement Learning

UDO is a versatile tool for offline tuning of database systems for specific workloads. UDO can consider a variety of tuning choices, ranging from the selection of transaction code variants and indexes to database system parameter settings. UDO uses reinforcement learning to converge to near-optimal configurations, creating and evaluating different configurations via actual query executions (instead of relying on simplifying cost models). To cater to different parameter types, UDO distinguishes heavy parameters (which are expensive to change, e.g. physical design parameters) from light parameters. Specifically for optimizing heavy parameters, UDO uses reinforcement learning algorithms that allow delaying the point at which the reward feedback becomes available. This gives us the freedom to optimize the point in time and the order in which different configurations are created and evaluated (by benchmarking a workload sample). UDO uses a cost-based planner to minimize reconfiguration overheads. For instance, it aims to amortize the creation of expensive data structures by consecutively evaluating configurations that use them. We evaluate UDO on Postgres as well as MySQL and on TPC-H as well as TPC-C, optimizing a variety of light and heavy parameters concurrently.
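To illustrate the cost-based planning idea from the abstract, the sketch below orders pending heavy-parameter configurations greedily so that configurations sharing an expensive index are evaluated consecutively, amortizing the build cost. This is a minimal illustration, not UDO's actual planner: the index names, the cost constants (a build is 10x as expensive as a drop), and the greedy strategy are all assumptions made for the example.

```python
# Hypothetical heavy-parameter configurations: each is a frozenset of index names.
CONFIGS = [
    frozenset({"idx_a"}),
    frozenset({"idx_c"}),
    frozenset({"idx_a", "idx_b"}),
    frozenset({"idx_b"}),
]

def reconfig_cost(current, target):
    """Cost of switching configurations: build missing indexes,
    drop superfluous ones. Builds dominate (illustrative constants)."""
    return 10 * len(target - current) + 1 * len(current - target)

def plan_order(configs, start=frozenset()):
    """Greedy cost-based planner: always evaluate next the pending
    configuration that is cheapest to reach from the current one, so
    expensive index builds are shared by consecutive evaluations."""
    pending, order, cur = list(configs), [], start
    while pending:
        nxt = min(pending, key=lambda c: reconfig_cost(cur, c))
        order.append(nxt)
        pending.remove(nxt)
        cur = nxt
    return order

order = plan_order(CONFIGS)
total = sum(reconfig_cost(a, b) for a, b in zip([frozenset()] + order, order))
naive = sum(reconfig_cost(a, b) for a, b in zip([frozenset()] + CONFIGS, CONFIGS))
```

On this toy input, the planner evaluates `{idx_a}` and `{idx_a, idx_b}` back to back (reusing `idx_a`), giving a total reconfiguration cost of 32 versus 43 for the naive input order.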
