A Storage Advisor for Hybrid-Store Databases

With the SAP HANA database, SAP offers a high-performance in-memory hybrid-store database. Hybrid-store databases---that is, databases supporting row- and column-oriented data management---are getting more and more prominent. While the columnar management offers high-performance capabilities for analyzing large quantities of data, the row-oriented store can handle transactional point queries as well as inserts and updates more efficiently. To effectively take advantage of both stores at the same time the novel question whether to store the given data row- or column-oriented arises. We tackle this problem with a storage advisor tool that supports database administrators at this decision. Our proposed storage advisor recommends the optimal store based on data and query characteristics; its core is a cost model to estimate and compare query execution times for the different stores. Besides a per-table decision, our tool also considers to horizontally and vertically partition the data and manage the partitions on different stores. We evaluated the storage advisor for the use in the SAP HANA database; we show the recommendation quality as well as the benefit of having the data in the optimal store with respect to increased query performance.

[1]  Martin L. Kersten,et al.  Database Architecture Optimized for the New Bottleneck: Memory Access , 1999, VLDB.

[2]  Inderpal Singh Mumick,et al.  Selection of views to materialize in a data warehouse , 1997, IEEE Transactions on Knowledge and Data Engineering.

[3]  Surajit Chaudhuri,et al.  Database tuning advisor for microsoft SQL server 2005: demo , 2005, SIGMOD '05.

[4]  Surajit Chaudhuri,et al.  An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server , 1997, VLDB.

[5]  Hasso Plattner,et al.  A common database approach for OLTP and OLAP using an in-memory column database , 2009, SIGMOD Conference.

[6]  Vivek R. Narasayya,et al.  Integrating vertical and horizontal partitioning into automated physical database design , 2004, SIGMOD '04.

[7]  David J. DeWitt,et al.  Weaving Relations for Cache Performance , 2001, VLDB.

[8]  Alexander Zeier,et al.  Speeding Up Queries in Column Stores - A Case for Compression , 2010, DaWak.

[9]  Zhiwei Xu,et al.  RCFile: A fast and space-efficient data placement structure in MapReduce-based warehouse systems , 2011, 2011 IEEE 27th International Conference on Data Engineering.

[10]  Wolfgang Lehner,et al.  SAP HANA database: data management for modern business applications , 2012, SGMD.

[11]  Frederick Reiss,et al.  Constant-Time Query Processing , 2008, 2008 IEEE 24th International Conference on Data Engineering.

[12]  Marcin Zukowski,et al.  MonetDB/X100: Hyper-Pipelining Query Execution , 2005, CIDR.

[13]  Surajit Chaudhuri,et al.  Automated Selection of Materialized Views and Indexes in SQL Databases , 2000, VLDB.

[14]  Alexander Zeier,et al.  HYRISE - A Main Memory Hybrid Storage Engine , 2010, Proc. VLDB Endow..

[15]  Sam Lightstone,et al.  DB2 Design Advisor: Integrated Automatic Physical Database Design , 2004, VLDB.