TridentKV: A Read-Optimized LSM-tree Based KV Store via Adaptive Indexing and Space-Efficient Partitioning

LSM-tree based key-value (KV) stores suffer severe read performance loss due to the leveled structure of the LSM-tree. Especially, when modern storage devices with high bandwidth and low latency are used, the read performance of KV store is seriously affected by inefficient file indexing. Besides, due to the deletion pattern of inserting tombstones, the KV stores based on LSM-tree are faced with the problem of read performance fluctuations that are caused by large-scale data deletion (also referred to as the Read-After-Delete problem). In this article, TridentKV is proposed to improve the read performance of KV stores. An adaptive learned index structure is first designed to speed up file indexing. Also, a space-efficient partition strategy is proposed to solve the Read-After-Delete problem. Besides, asynchronous reading design is adopted, and SPDK is supported for high concurrency and low latency. TridentKV is implemented on RocksDB and the evaluation results indicate that compared with RocksDB, the read performance of TridentKV is improved by 7× to 12× without loss of write performance and TridentKV provides stable read performance even if a large number of deletions or migrations occur. Instead of RocksDB, TridentKV is exploited to store metadata in Ceph, which improves the read performance of Ceph by 20%<inline-formula><tex-math notation="LaTeX">$\sim$</tex-math><alternatives><mml:math><mml:mo>∼</mml:mo></mml:math><inline-graphic xlink:href="wan-ieq1-3118599.gif"/></alternatives></inline-formula>60%.