Collaborative Compaction Optimization System using Near-Data Processing for LSM-tree-based Key-Value Stores

Abstract: Log-structured merge tree (LSM-tree) based key–value stores are widely employed in large-scale storage systems. During compaction, high-level sorted string table files (SSTables) are merged with the low-level SSTables whose key ranges overlap, and the merged data is sorted to serve data queries. However, the compaction process incurs write amplification, which degrades system performance, particularly under update-intensive workloads. Current optimizations mostly focus on reducing the overhead of compaction on the host but rarely exploit the parallelism between the host and the device to use the computing resources of the entire system for compaction. In this study, we propose Co-KV, a Collaborative Key-Value store, to improve compaction performance in LSM-tree-based key–value stores. Co-KV is built on a storage device that supports the near-data processing (NDP) model. Co-KV has the following advantages: (1) it reduces write amplification and host-side CPU cost by scheduling the offloading of compaction between a host computer and an NDP-enabled storage device; (2) it relieves the overhead of data transfer between the host and the storage device; and (3) it improves the compaction of LSM-tree-based key–value stores under update-intensive workloads. We used an Open Ethernet Drive (OED), a real-world NDP platform, as the testbed for our experiments. Extensive db_bench evaluations demonstrate that, compared with the state-of-the-art LevelDB, Co-KV improves overall throughput by approximately 1.75x, reduces CPU cost by approximately 68.1%, and reduces write amplification by up to 50.0%. Under YCSB workloads, Co-KV increases throughput by 1.7x to 1.9x while decreasing write amplification and average latency by up to 50.0% and 46.3%, respectively.
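The idea of sharing one compaction between the host and the device can be illustrated with a minimal Python sketch. It is not the Co-KV implementation: the abstract does not describe the scheduling policy, so the key-range split at a `pivot`, the function names, and the use of a second thread as a stand-in for the NDP device are all assumptions made for illustration. SSTables are modeled as sorted lists of `(key, value)` pairs with unique keys per table.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def merge_sstables(upper, lower):
    """Merge two sorted (key, value) lists; entries from `upper` (newer) win."""
    # Tag each entry with a priority so that, for equal keys, the newer
    # entry (priority 0) is yielded before the older one (priority 1).
    tagged = heapq.merge(
        ((k, 0, v) for k, v in upper),
        ((k, 1, v) for k, v in lower),
    )
    merged = []
    last_key = object()  # sentinel that equals no real key
    for key, _, value in tagged:
        if key != last_key:  # keep only the first (newest) version of a key
            merged.append((key, value))
            last_key = key
    return merged


def split_at(table, pivot):
    """Split a sorted table into keys below the pivot and keys at or above it."""
    left = [(k, v) for k, v in table if k < pivot]
    right = [(k, v) for k, v in table if k >= pivot]
    return left, right


def collaborative_compaction(upper, lower, pivot):
    """Hypothetical split: one sub-range merged by the 'host', one by the 'device'."""
    up_host, up_dev = split_at(upper, pivot)
    lo_host, lo_dev = split_at(lower, pivot)
    with ThreadPoolExecutor(max_workers=2) as pool:
        host = pool.submit(merge_sstables, up_host, lo_host)
        # Assumption: this second worker stands in for the NDP-enabled device.
        device = pool.submit(merge_sstables, up_dev, lo_dev)
        # The sub-ranges are disjoint and ordered, so concatenation stays sorted.
        return host.result() + device.result()


upper = [("a", "A2"), ("c", "C2"), ("x", "X2")]
lower = [("a", "A1"), ("b", "B1"), ("y", "Y1")]
result = collaborative_compaction(upper, lower, pivot="m")
```

Because the two sub-ranges do not overlap, each side can merge independently, and the sub-range assigned to the device would not need to cross the host interface in a real NDP setup. How the split point is chosen in practice is left open here.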
