VAT: Asymptotic Cost Analysis for Multi-Level Key-Value Stores

Over the past years, there has been an increasing number of key-value (KV) store designs, each optimizing for a different set of requirements. Furthermore, with the advancements of storage technology the design space of KV stores has become even more complex. More recent KV-store designs target fast storage devices, such as SSDs and NVM. Most of these designs aim to reduce amplification during data reorganization by taking advantage of device characteristics. However, until today most analysis of KV-store designs is experimental and limited to specific design points. This makes it difficult to compare tradeoffs across different designs, find optimal configurations and guide future KV-store design. In this paper, we introduce the Variable Amplification- Throughput analysis (VAT) to calculate insert-path amplification and its impact on multi-level KV-store performance.We use VAT to express the behavior of several existing design points and to explore tradeoffs that are not possible or easy to measure experimentally. VAT indicates that by inserting randomness in the insert-path, KV stores can reduce amplification by more than 10x for fast storage devices. Techniques, such as key-value separation and tiering compaction, reduce amplification by 10x and 5x, respectively. Additionally, VAT predicts that the advancements in device technology towards NVM, reduces the benefits from both using key-value separation and tiering.

[1]  Margo I. Seltzer,et al.  Heuristic Cleaning Algorithms in Log-Structured File Systems , 1995, USENIX.

[2]  Jason Cong,et al.  Atlas: Baidu's key-value storage system for cloud data , 2015, 2015 31st Symposium on Mass Storage Systems and Technologies (MSST).

[3]  Andrea C. Arpaci-Dusseau,et al.  Redesigning LSMs for Nonvolatile Memory with NoveLSM , 2018, USENIX Annual Technical Conference.

[4]  Mendel Rosenblum,et al.  The design and implementation of a log-structured file system , 1991, SOSP '91.

[5]  Raghu Ramakrishnan,et al.  bLSM: a general purpose log structured merge tree , 2012, SIGMOD Conference.

[6]  Hong Jiang,et al.  SifrDB: A Unified Solution for Write-Optimized Key-Value Stores in Large Datacenter , 2018, SoCC.

[7]  Idit Keidar,et al.  Scaling concurrent log-structured data stores , 2015, EuroSys.

[8]  Yongkun Li,et al.  Enabling Efficient Updates in KV Storage via Hashing , 2018, USENIX Annual Technical Conference.

[9]  Erez Zadok,et al.  Building workload-independent storage with VT-trees , 2013, FAST.

[10]  Pilar González-Férez,et al.  An Efficient Memory-Mapped Key-Value Store for Flash Storage , 2018, SoCC.

[11]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[12]  Michael A. Bender,et al.  An Introduction to Bε-trees and Write-Optimization , 2015, login Usenix Mag..

[13]  Ada Gavrilovska,et al.  Mutant: Balancing Storage Cost and Latency in LSM-Tree Data Stores , 2018, SoCC.

[14]  S. Sudarshan,et al.  Incremental Organization for Data Recording and Warehousing , 1997, VLDB.

[15]  Ittai Abraham,et al.  PebblesDB: Building Key-Value Stores using Fragmented Log-Structured Merge Trees , 2017, SOSP.

[16]  Pilar González-Férez,et al.  Tucana: Design and Implementation of a Fast and Efficient Scale-up Key-value Store , 2016, USENIX ATC.

[17]  Yanpei Chen,et al.  Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads , 2012, Proc. VLDB Endow..

[18]  Avraham Adler,et al.  Lambert-W Function , 2015 .

[19]  Xiao Liu,et al.  Basic Performance Measurements of the Intel Optane DC Persistent Memory Module , 2019, ArXiv.

[20]  Song Jiang,et al.  LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data Items , 2015, USENIX Annual Technical Conference.

[21]  Hyeontaek Lim,et al.  Towards Accurate and Fast Evaluation of Multi-Stage Log-structured Designs , 2016, FAST.

[22]  Manos Athanassoulis,et al.  Monkey: Optimal Navigable Key-Value Store , 2017, SIGMOD Conference.

[23]  Jason Cong,et al.  An efficient design and implementation of LSM-tree based key-value store on open-channel SSD , 2014, EuroSys '14.

[24]  Andrea C. Arpaci-Dusseau,et al.  WiscKey: Separating Keys from Values in SSD-conscious Storage , 2016, FAST.

[25]  Jian Xu,et al.  NOVA: A Log-structured File System for Hybrid Volatile/Non-volatile Main Memories , 2016, FAST.

[26]  Rachid Guerraoui,et al.  TRIAD: Creating Synergies Between Memory, Disk and Log in Log Structured Key-Value Stores , 2017, USENIX Annual Technical Conference.

[27]  Stratos Idreos,et al.  The Log-Structured Merge-Bush & the Wacky Continuum , 2019, SIGMOD Conference.

[28]  Patrick E. O'Neil,et al.  The log-structured merge-tree (LSM-tree) , 1996, Acta Informatica.

[29]  Stratos Idreos,et al.  Dostoevsky: Better Space-Time Trade-Offs for LSM-Tree Based Key-Value Stores via Adaptive Removal of Superfluous Merging , 2018, SIGMOD Conference.