Reducing the Storage Overhead of Main-Memory OLTP Databases with Hybrid Indexes

Using indexes for query execution is crucial for achieving high performance in modern on-line transaction processing (OLTP) databases. For a main-memory database, however, these indexes consume a large fraction of the total memory available and are thus a major source of storage overhead. To reduce this overhead, we propose using a two-stage index: the first stage ingests all incoming entries and is kept small for fast read and write operations. The index periodically migrates entries from the first stage to the second, which uses a more compact, read-optimized data structure. Our first contribution is the hybrid index, a dual-stage index architecture that achieves both space efficiency and high performance. Our second contribution is Dual-Stage Transformation (DST), a set of guidelines for converting any order-preserving index structure into a hybrid index. Our third contribution is applying DST to four popular order-preserving index structures and evaluating them in both standalone microbenchmarks and a full in-memory DBMS using several transaction processing workloads. Our results show that hybrid indexes provide throughput comparable to that of the original indexes while reducing memory overhead by up to 70%.
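The two-stage idea can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `HybridIndexSketch`, the `merge_threshold` parameter, and the choice of a Python dict for the first stage and parallel sorted lists for the second are all assumptions made for illustration. The sketch only shows the general pattern of absorbing writes in a small mutable structure and periodically migrating them into a compact, sorted, read-only one.

```python
import bisect


class HybridIndexSketch:
    """Toy two-stage index (hypothetical; not the paper's data structures)."""

    def __init__(self, merge_threshold=4):
        # Assumed trigger: merge once the first stage holds this many entries.
        self.merge_threshold = merge_threshold
        self.dynamic = {}        # stage 1: small, absorbs every write
        self.static_keys = []    # stage 2: sorted keys, rebuilt only on merge
        self.static_vals = []    # stage 2: values aligned with static_keys

    def insert(self, key, value):
        # All incoming entries go to the first stage.
        self.dynamic[key] = value
        if len(self.dynamic) >= self.merge_threshold:
            self.merge()

    def get(self, key, default=None):
        # Check the first stage first: it holds the most recent entries.
        if key in self.dynamic:
            return self.dynamic[key]
        # Fall back to a binary search over the compact second stage.
        i = bisect.bisect_left(self.static_keys, key)
        if i < len(self.static_keys) and self.static_keys[i] == key:
            return self.static_vals[i]
        return default

    def merge(self):
        # Migrate every first-stage entry into the second stage;
        # newer values overwrite older ones with the same key.
        merged = dict(zip(self.static_keys, self.static_vals))
        merged.update(self.dynamic)
        self.static_keys = sorted(merged)
        self.static_vals = [merged[k] for k in self.static_keys]
        self.dynamic.clear()


idx = HybridIndexSketch(merge_threshold=3)
for k in [5, 1, 9, 3]:
    idx.insert(k, str(k))
```

With a threshold of 3, the first three inserts trigger one merge, so keys 1, 5, and 9 end up in the sorted second stage while key 3 remains in the first stage. A lookup consults both stages, which is why keeping the first stage small matters for read performance.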
