ElasticBF: Fine-grained and Elastic Bloom Filter Towards Efficient Read for LSM-tree-based KV Stores

LSM-tree-based KV stores suffer from severe read amplification, especially when the store is large. Even worse, many applications issue a large number of lookups for nonexistent keys, which wastes many extra I/Os. Although Bloom filters can be used to speed up reads, existing designs usually adopt a uniform setting for all Bloom filters and do not support dynamic adjustment, which results in either a high false positive rate or large memory consumption. To address this issue, we propose ElasticBF, which constructs multiple small filters for each SSTable and dynamically loads them into memory as needed based on access frequency, so it realizes fine-grained and elastic adjustment at run time with the same memory usage. Experiments show that ElasticBF achieves 1.94× to 2.24× the read throughput of LevelDB under different workloads while preserving the same write performance. More importantly, ElasticBF is orthogonal to existing work that optimizes the structure of KV stores, so it can be used as an accelerator to further speed up their read performance.
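The idea of several small filters per SSTable, enabled according to access frequency under a fixed memory budget, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `FilterUnit`, `SSTable`, and `rebalance`, the parameter values, and the greedy "hottest table first" policy are all assumptions chosen for brevity, and the paper's actual adjustment rule may differ.

```python
import hashlib


class FilterUnit:
    """One small Bloom filter; `seed` gives each unit its own hash functions."""

    def __init__(self, keys, seed, bits_per_key=4, num_hashes=3):
        self.seed = seed
        self.num_hashes = num_hashes
        self.size = max(8, bits_per_key * len(keys))
        self.bits = bytearray(self.size)  # one byte per bit, for clarity
        for key in keys:
            for pos in self._positions(key):
                self.bits[pos] = 1

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{self.seed}:{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def may_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Hypothetical SSTable: all units are built up front, only some are enabled."""

    def __init__(self, name, keys, num_units=4):
        self.name = name
        self.units = [FilterUnit(keys, seed=s) for s in range(num_units)]
        self.enabled = 1    # number of units currently held in memory
        self.accesses = 0   # lookup counter used as the hotness signal

    def may_contain(self, key):
        self.accesses += 1
        # A key passes only if every enabled unit accepts it, so enabling
        # more units can only lower the false positive rate.
        return all(u.may_contain(key) for u in self.units[:self.enabled])


def rebalance(tables, budget_units):
    """Assumed policy: one unit per table, spare units go to the hottest tables."""
    for t in tables:
        t.enabled = 1
    spare = max(0, budget_units - len(tables))
    for t in sorted(tables, key=lambda t: t.accesses, reverse=True):
        extra = min(spare, len(t.units) - 1)
        t.enabled += extra
        spare -= extra


hot = SSTable("hot", [f"k{i}" for i in range(1000)])
cold = SSTable("cold", [f"c{i}" for i in range(1000)])
for i in range(500):
    hot.may_contain(f"missing{i}")   # many lookups hit the hot table
cold.may_contain("missing0")
rebalance([hot, cold], budget_units=5)
print(hot.enabled, cold.enabled)
```

With a budget of five units, the frequently accessed table ends up with four enabled units and the rarely accessed one with a single unit, so the total memory is the same as a uniform split would use while the hot table sees fewer false positives.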
