SILT: a memory-efficient, high-performance key-value store

SILT (Small Index Large Table) is a memory-efficient, high-performance key-value store system based on flash storage that scales to serve billions of key-value items on a single node. It requires only 0.7 bytes of DRAM per entry and retrieves a key-value pair with an average of 1.01 flash reads. SILT combines new algorithmic and systems techniques to balance the use of memory, storage, and computation. Our contributions include: (1) the design of three basic key-value stores, each with a different emphasis on memory-efficiency and write-friendliness; (2) synthesis of the basic key-value stores to build a SILT key-value store system; and (3) an analytical model for tuning system parameters carefully to meet the needs of different workloads. SILT requires one to two orders of magnitude less memory to provide comparable throughput to current high-performance key-value systems on a commodity desktop system with flash storage.
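
To make the headline figures concrete, the following is a minimal back-of-the-envelope sketch in Python. It uses only the two numbers quoted above (0.7 bytes of DRAM per entry and 1.01 flash reads per lookup). The function names and the one-billion-entry workload are hypothetical choices for illustration. This is not SILT's analytical model, which is not described in this abstract.

```python
# Back-of-the-envelope arithmetic from the two figures quoted in the abstract.
# Function names and workload sizes are illustrative assumptions, not SILT's model.

DRAM_BYTES_PER_ENTRY = 0.7     # quoted in the abstract
FLASH_READS_PER_LOOKUP = 1.01  # quoted in the abstract (average)


def index_dram_bytes(num_entries: int) -> float:
    """Approximate DRAM needed to index num_entries key-value items."""
    return num_entries * DRAM_BYTES_PER_ENTRY


def expected_flash_reads(num_lookups: int) -> float:
    """Approximate total flash reads incurred by num_lookups retrievals."""
    return num_lookups * FLASH_READS_PER_LOOKUP


if __name__ == "__main__":
    entries = 1_000_000_000  # hypothetical: one billion items on a single node
    lookups = 1_000_000      # hypothetical: one million retrievals
    print(f"DRAM for index: {index_dram_bytes(entries) / 1e9:.2f} GB")
    print(f"Flash reads:    {expected_flash_reads(lookups):,.0f}")
```

Under these assumptions, indexing one billion entries takes roughly 0.7 GB of DRAM, and one million lookups cost about 10,000 flash reads beyond the one-read-per-lookup minimum.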
