Bridging the Gap Between Theory and Practice on Insertion-Intensive Database

With the prevalence of online platforms, data is now generated and accessed by users at a very high rate. Moreover, applications such as stock trading and high-frequency trading require guaranteed low delays for every database operation. It is therefore important to design databases that sustain insertions and queries at a consistently high rate without introducing long delays during insertion. In this paper, we propose Nested B-trees (NB-trees), an index that achieves a consistently high insertion rate on large volumes of data while providing asymptotically optimal query performance that is also efficient in practice. NB-trees support insertions at rates higher than LSM-trees, the state-of-the-art index for insertion-intensive workloads, while avoiding their long insertion delays and improving on their query performance. When complemented with Bloom filters, they approach the query performance of B-trees. In our experiments, NB-trees had worst-case delays up to 1000 times smaller than those of LevelDB, RocksDB and bLSM, three commonly used LSM-tree data stores. They also performed queries more than 4 times faster than LevelDB and 1.5 times faster than bLSM and RocksDB, while outperforming all three in average insertion rate.
