SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning

Robust query optimization becomes illusory in the presence of correlated predicates or user-defined functions. Occasionally, the query optimizer will choose join orders whose execution time is many orders of magnitude higher than necessary. We present SkinnerDB, a novel database management system designed from the ground up for reliable optimization and robust performance. SkinnerDB implements several adaptive query processing strategies based on reinforcement learning. We divide the execution of a query into small time slices, in each of which a different join order may be executed. In this way, we converge to optimal join orders with regret bounds, meaning that the expected difference between the actual execution time and the execution time of an optimal join order is bounded. To the best of our knowledge, our execution strategies are the first to provide comparable formal guarantees. SkinnerDB can be used as a layer on top of any existing database management system. In this mode, we use optimizer hints to force the existing system to try out different join orders, carefully restricting the execution time per join order and data batch via timeouts. We choose timeouts according to an iterative scheme that balances execution time across different timeouts to guarantee bounded regret. Alternatively, SkinnerDB can be used as a standalone system, featuring an execution engine that is tailored to the requirements of join order learning. In particular, we use a specialized multi-way join algorithm and a concise tuple representation to facilitate fast switches between join orders. In our demonstration, we let participants experiment with different query types and databases. We visualize the learning process and compare against baselines.
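
The idea of learning a join order across time slices can be illustrated with a small simulation. The sketch below is not SkinnerDB's implementation: it assumes a plain UCB1 multi-armed bandit in which each candidate join order is one arm, and it assumes the reward of a time slice is a number in [0, 1] describing how much progress the chosen join order made. The function names, the `progress_rates` parameter, and the noise model are all hypothetical and exist only for illustration.

```python
import math
import random


def ucb1_pick(counts, reward_sums, total_rounds):
    """Pick the join order (arm) to run in the next time slice using UCB1."""
    # Run every join order once before trusting any average.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Otherwise maximize average reward plus an exploration bonus.
    return max(
        range(len(counts)),
        key=lambda a: reward_sums[a] / counts[a]
        + math.sqrt(2.0 * math.log(total_rounds) / counts[a]),
    )


def run_time_slices(progress_rates, num_slices, seed=0):
    """Simulate slice-by-slice execution and return slices spent per join order.

    progress_rates[i] is a hypothetical mean progress per slice for join
    order i; a real system would measure progress instead of simulating it.
    """
    rng = random.Random(seed)
    counts = [0] * len(progress_rates)
    reward_sums = [0.0] * len(progress_rates)
    for t in range(1, num_slices + 1):
        arm = ucb1_pick(counts, reward_sums, t)
        # Simulated noisy reward, clamped to [0, 1] (assumption of this sketch).
        noisy = rng.gauss(progress_rates[arm], 0.05)
        reward = min(1.0, max(0.0, noisy))
        counts[arm] += 1
        reward_sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Three hypothetical join orders; the second one makes the most progress.
    slices_per_order = run_time_slices([0.05, 0.9, 0.3], num_slices=2000)
    print(slices_per_order)
```

In this toy setting, most time slices end up being spent on the join order with the highest progress rate, while the poor join orders receive only enough slices to rule them out. That is the intuition behind a regret bound: the time lost to suboptimal join orders grows slowly relative to total execution time.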
