SkimpyStash: RAM space skimpy key-value store on flash-based storage

We present SkimpyStash, a RAM space skimpy key-value store on flash-based storage, designed for high-throughput, low-latency server applications. The distinguishing feature of SkimpyStash is its design goal of an extremely low RAM footprint of about 1 (± 0.5) byte per key-value pair, which is more aggressive than earlier designs. SkimpyStash uses a hash table directory in RAM to index key-value pairs stored in a log-structured manner on flash. To break the barrier of one flash pointer (say, 4 bytes) of RAM overhead per key, it "moves" most of the pointers that locate each key-value pair from RAM to flash itself. This is realized by (i) resolving hash table collisions using linear chaining, where multiple keys that resolve (collide) to the same hash table bucket are chained in a linked list, and (ii) storing the linked lists on flash itself, with a pointer in each hash table bucket in RAM pointing to the first record of the chain on flash, at the cost of multiple flash reads per lookup. Two further techniques are used to improve performance: (iii) two-choice based load balancing to reduce wide variation in bucket sizes (and hence in chain lengths and associated lookup times), together with a bloom filter in each hash table directory slot in RAM to disambiguate the choice during lookup, and (iv) a compaction procedure that packs bucket chain records contiguously onto flash pages so as to reduce flash reads during lookup. The average bucket size is the critical design parameter and serves as a powerful knob for making a continuum of tradeoffs between low RAM usage and low lookup latency. Our evaluations on commodity server platforms with real-world data center applications show that SkimpyStash provides throughputs ranging from a few tens of thousands to upwards of 100,000 get-set operations/sec.
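
The core idea of techniques (i) to (iii) can be illustrated with a toy model. The following is a minimal sketch under several assumptions, not the authors' implementation: a Python list stands in for the append-only flash log, a Python set per bucket stands in for the per-slot bloom filter (so it has no false positives), SHA-1 is used only as a convenient source of two hash values, and all names (`SkimpySketch`, `heads`, `flash_reads`) are hypothetical. Compaction (iv) is not modeled.

```python
import hashlib


class SkimpySketch:
    """Toy model of a RAM directory whose bucket chains live in a flash log."""

    def __init__(self, num_buckets=8):
        self.log = []                      # "flash": (key, value, next_offset) records
        self.heads = [None] * num_buckets  # "RAM": one log offset per bucket
        self.sizes = [0] * num_buckets     # chain length per bucket, for two-choice
        # Assumption: exact sets stand in for the per-slot bloom filters.
        self.filters = [set() for _ in range(num_buckets)]
        self.flash_reads = 0               # counts simulated flash reads in get()

    def _choices(self, key):
        """Return the two candidate buckets for a key."""
        digest = hashlib.sha1(key.encode()).digest()
        n = len(self.heads)
        return (int.from_bytes(digest[:4], "big") % n,
                int.from_bytes(digest[4:8], "big") % n)

    def set(self, key, value):
        b1, b2 = self._choices(key)
        if key in self.filters[b1]:
            bucket = b1                    # key already lives here: shadow old record
        elif key in self.filters[b2]:
            bucket = b2
        else:
            # Two-choice load balancing: insert into the shorter chain.
            bucket = b1 if self.sizes[b1] <= self.sizes[b2] else b2
            self.filters[bucket].add(key)
        # Append to the log; the new record points at the previous chain head.
        self.log.append((key, value, self.heads[bucket]))
        self.heads[bucket] = len(self.log) - 1
        self.sizes[bucket] += 1

    def get(self, key):
        for bucket in self._choices(key):
            if key not in self.filters[bucket]:
                continue                   # filter rules this bucket out
            offset = self.heads[bucket]
            while offset is not None:      # walk the chain stored on "flash"
                stored_key, value, offset = self.log[offset]
                self.flash_reads += 1
                if stored_key == key:
                    return value           # newest record is met first
        return None
```

In this sketch the only per-bucket RAM pointer is the entry in `heads`, so with a 4-byte pointer and an average bucket size of k keys the pointer overhead is roughly 4/k bytes per key, while a lookup may read up to a full chain of records from the log. This is the tradeoff that the average bucket size controls; the bloom filter bits add to the RAM cost and are not accounted for here.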
