LDC: A Lower-Level Driven Compaction Method to Optimize SSD-Oriented Key-Value Stores

Log-structured merge (LSM) tree key-value (KV) stores have been widely deployed in many NoSQL and SQL systems, serving online big data applications such as social networking, bioinformatics, graph processing, and machine learning. Batching the merging of sorted data (i.e., compaction) in LSM-tree KV stores greatly improves write efficiency, leading to good write performance and high space efficiency. Recently, several lazy compaction methods have been proposed to further improve system throughput by delaying compaction so that more data accumulates within each compaction batch. However, batched writing also causes significant tail latency, which is unacceptable for online processing, and the newly proposed lazy approaches worsen the tail latency problem. Furthermore, the unbalanced read/write performance of widely deployed SSDs makes performance optimization harder. To optimize both tail latency and system throughput, in this paper we propose a novel Lower-level Driven Compaction (LDC) method for LSM-tree KV stores. LDC breaks the limitations of the traditional upper-level driven compaction approach and triggers the actual compaction actions by lower-level data. It both decreases the compaction granularity effectively, for smaller tail latency, and reduces the write amplification of LSM-tree compaction, for higher throughput. We have implemented LDC in LevelDB; the experimental results indicate that LDC can reduce the 99.9th percentile latency by a factor of 2.62 compared with the traditional upper-level driven compaction mechanism, while achieving 56.7% to 72.3% higher system throughput.
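
To make the cost of traditional upper-level driven compaction concrete, the following is a minimal sketch of a single compaction step in which a small sorted run from an upper level is merged into an overlapping sorted run in the level below. It is not the LDC algorithm and not LevelDB code; the function names `compact` and `rewrite_ratio`, the toy data, and the entry-count ratio used as a stand-in for write amplification are all assumptions made for illustration.

```python
# Toy model of one upper-level driven compaction step (illustrative only).
# Runs are sorted lists of (key, value) pairs; the upper run holds newer data.

def compact(upper, lower):
    """Merge two sorted runs; on duplicate keys the upper (newer) value wins."""
    merged = dict(lower)   # start from the older, lower-level entries
    merged.update(upper)   # newer entries overwrite older ones
    return sorted(merged.items())


def rewrite_ratio(upper, lower):
    """Entries written by the compaction per upper-level entry pushed down.

    A simplified, entry-count proxy for write amplification (an assumption
    of this sketch, not the paper's definition).
    """
    return len(compact(upper, lower)) / len(upper)


if __name__ == "__main__":
    upper = [(3, "new"), (7, "new")]            # 2 entries being pushed down
    lower = [(k, "old") for k in range(10)]     # 10 overlapping entries
    print(compact(upper, lower)[3])             # (3, 'new')
    print(rewrite_ratio(upper, lower))          # 5.0
```

In this toy case, moving two new entries down forces all ten entries of the overlapping lower-level run to be rewritten. This rewriting of largely unchanged lower-level data is the kind of cost that the abstract refers to as write amplification, and the size of each such merge is what it calls compaction granularity.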
