Autonomous workload-driven reorganization of column groupings in MMDBS

A current trend for achieving high query performance, even in very large data warehouse and business intelligence systems, is to exploit main-memory-based processing techniques such as compression, cache-conscious strategies, and optimized data structures. However, update processing and skew in the data distribution can degrade such densely packed and highly compressed data structures, negatively affecting memory efficiency and query performance. Reorganization tasks that repair these data structures are therefore necessary, but they should be applied carefully so that they do not significantly impact query execution or system availability. In this paper, we consider the specific problem of tuple layout in banked storage structures. Based on runtime statistics that capture typical access patterns in the current workload, we present a bank reassignment approach that can be piggybacked onto maintenance tasks without any administration overhead. We have implemented this approach in the IBM Smart Analytics Optimizer (ISAOPT). The results of our experimental evaluation show that a simple automatic restructuring of the considered hybrid row-column-store structures offers opportunities to improve query runtimes when a slight memory overhead is acceptable.
