SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data

Modern key-value stores often use write-optimized indexes and compact in-memory indexes to speed up read and write performance. One popular write-optimized index is the Log-structured merge-tree (LSM-tree) which provides indexed access to write-intensive data. It has been increasingly used as a storage backbone for many services, including file system metadata management, graph processing engines, and machine learning feature storage engines. Existing LSM-tree implementations often exhibit high write amplifications caused by compaction, and lack optimizations to maximize read performance on solid-state disks. The goal of this paper is to explore techniques that leverage common workload characteristics shared by many systems using key-value stores to reduce the read/write amplification overhead typically associated with general-purpose LSM-tree implementations. Our experiments show that by applying these design techniques, our new implementation of a key-value store, SlimDB, can be two to three times faster, use less memory to cache metadata indices, and show lower tail latency in read operations compared to popular LSM-tree implementations such as LevelDB and RocksDB.

[1]  Jin Li,et al.  SkimpyStash: RAM space skimpy key-value store on flash-based storage , 2011, SIGMOD '11.

[2]  Guy Joseph Jacobson,et al.  Succinct static data structures , 1988 .

[3]  Bingsheng He,et al.  Tree indexing on solid state drives , 2010, Proc. VLDB Endow..

[4]  Joaquin Quiñonero Candela,et al.  Practical Lessons from Predicting Clicks on Ads at Facebook , 2014, ADKDD'14.

[5]  Kai Ren,et al.  TABLEFS: Enhancing Metadata Efficiency in the Local File System , 2013, USENIX Annual Technical Conference.

[6]  Dong Wang,et al.  Click-through Prediction for Advertising in Twitter Timeline , 2015, KDD.

[7]  Hyeontaek Lim,et al.  Towards Accurate and Fast Evaluation of Multi-Stage Log-structured Designs , 2016, FAST.

[8]  Manos Athanassoulis,et al.  Monkey: Optimal Navigable Key-Value Store , 2017, SIGMOD Conference.

[9]  Gerth Stølting Brodal,et al.  Lower bounds for external memory dictionaries , 2003, SODA '03.

[10]  Chris Douglas,et al.  Walnut: a unified cloud object store , 2012, SIGMOD Conference.

[11]  Josef Bacik,et al.  BTRFS: The Linux B-Tree Filesystem , 2013, TOS.

[12]  Jin Li,et al.  FlashStore , 2010, Proc. VLDB Endow..

[13]  Timothy G. Armstrong,et al.  LinkBench: a database benchmark based on the Facebook social graph , 2013, SIGMOD '13.

[14]  Martin Dietzfelbinger,et al.  Hash, Displace, and Compress , 2009, ESA.

[15]  Adam Silberstein,et al.  Benchmarking cloud serving systems with YCSB , 2010, SoCC '10.

[16]  Bin Fan,et al.  Cuckoo Filter: Practically Better Than Bloom , 2014, CoNEXT.

[17]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[18]  Bin Fan,et al.  SILT: a memory-efficient, high-performance key-value store , 2011, SOSP.

[19]  Song Jiang,et al.  LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data Items , 2015, USENIX Annual Technical Conference.

[20]  Dong Zhou,et al.  When Cycles Are Cheap, Some Tables Can Be Huge , 2013, HotOS.

[21]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[22]  Michael A. Bender,et al.  Cache-oblivious streaming B-trees , 2007, SPAA '07.

[23]  Song Jiang,et al.  Workload analysis of a large-scale key-value store , 2012, SIGMETRICS '12.

[24]  Burton H. Bloom,et al.  Space/time trade-offs in hash coding with allowable errors , 1970, CACM.

[25]  Andrea C. Arpaci-Dusseau,et al.  WiscKey: Separating Keys from Values in SSD-conscious Storage , 2016, FAST.

[26]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[27]  Erez Zadok,et al.  Building workload-independent storage with VT-trees , 2013, FAST.

[28]  Konstantinos Panagiotou,et al.  The multiple-orientability thresholds for random hypergraphs , 2011, SODA '11.

[29]  S. Sudarshan,et al.  Incremental Organization for Data Recording and Warehousing , 1997, VLDB.

[30]  Raghu Ramakrishnan,et al.  bLSM: a general purpose log structured merge tree , 2012, SIGMOD Conference.

[31]  Josef Bacik,et al.  IBM Research Report BTRFS: The Linux B-tree Filesystem , 2012 .

[32]  Patrick E. O'Neil,et al.  The log-structured merge-tree (LSM-tree) , 1996, Acta Informatica.