Adaptive indexing in modern database kernels

Physical design represents one of the hardest problems for database management systems. Without proper tuning, systems cannot achieve good performance. Offline indexing creates indexes a priori assuming good workload knowledge and idle time. More recently, online indexing monitors the workload trends and creates or drops indexes online. Adaptive indexing takes another step towards completely automating the tuning process of a database system, by enabling incremental and partial online indexing. The main idea is that physical design changes continuously, adaptively, partially, incrementally and on demand while processing queries as part of the execution operators. As such it brings a plethora of opportunities for rethinking and improving every single corner of database system design. We will analyze the indexing space between offline, online and adaptive indexing through several state of the art indexing techniques, e. g., what-if analysis and soft indexes. We will discuss in detail adaptive indexing techniques such as database cracking, adaptive merging, sideways cracking and various hybrids that try to balance the online tuning overhead with the convergence speed to optimal performance. In addition, we will discuss how various aspects of modern techniques for database architectures, such as vectorization, bulk processing, column-store execution and storage affect adaptive indexing. Finally, we will discuss several open research topics towards fully automomous database kernels.

[1]  Martin L. Kersten,et al.  Self-organizing tuple reconstruction in column-stores , 2009, SIGMOD Conference.

[2]  Surajit Chaudhuri,et al.  An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server , 1997, VLDB.

[3]  Martin L. Kersten,et al.  Database Cracking , 2007, CIDR.

[4]  Surajit Chaudhuri,et al.  Physical Design Refinement: The "Merge-Reduce" Approach , 2006, EDBT.

[5]  Surajit Chaudhuri,et al.  To tune or not to tune?: a lightweight physical design alerter , 2006, VLDB.

[6]  Harumi A. Kuno,et al.  Self-selecting, self-tuning, incrementally optimized indexes , 2010, EDBT '10.

[7]  Harumi A. Kuno,et al.  Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores , 2011, Proc. VLDB Endow..

[8]  Michael Stonebraker,et al.  The case for partial indexes , 1989, SGMD.

[9]  Karsten Schmidt,et al.  Autonomous Management of Soft Indexes , 2007, 2007 IEEE 23rd International Conference on Data Engineering Workshop.

[10]  Martin L. Kersten,et al.  Updating a cracked database , 2007, SIGMOD '07.

[11]  Harumi A. Kuno,et al.  Adaptive indexing for relational keys , 2010, 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010).

[12]  Serge Abiteboul,et al.  COLT: continuous on-line tuning , 2006, SIGMOD Conference.

[13]  Sam Lightstone,et al.  Physical Database Design for Relational Databases , 2009, Encyclopedia of Database Systems.

[14]  Theo Härder Selecting an Optimal Set of Secondary Indices , 1976, ECI.

[15]  Surajit Chaudhuri,et al.  An Online Approach to Physical Design Tuning , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[16]  Daniel C. Zilio,et al.  DB2 advisor: an optimizer smart enough to recommend its own indexes , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[17]  Serge Abiteboul,et al.  COLT: Continuous On-Line Database Tuning , 2006 .

[18]  Sam Lightstone,et al.  DB2 Design Advisor: Integrated Automatic Physical Database Design , 2004, VLDB.

[19]  Surajit Chaudhuri,et al.  Database Tuning Advisor for Microsoft SQL Server 2005 , 2004, VLDB.

[20]  Harumi A. Kuno,et al.  Benchmarking Adaptive Indexing , 2010, TPCTC.

[21]  Surajit Chaudhuri,et al.  Table of Contents (pdf) , 2007, VLDB.

[22]  Praveen Seshadri,et al.  Generalized partial indexes , 1995, Proceedings of the Eleventh International Conference on Data Engineering.