Invertible Bloom Lookup Tables

We present a version of the Bloom filter data structure that supports not only the insertion, deletion, and lookup of key-value pairs, but also allows a complete listing of the pairs it contains with high probability, as long as the number of key-value pairs is below a designed threshold. Our structure allows the number of key-value pairs to greatly exceed this threshold during normal operation. Exceeding the threshold only temporarily prevents content listing and reduces the probability of a successful lookup. If entries are later deleted so that the structure returns below the threshold, listing and lookup again succeed with high probability. We also show that simple variations of our structure are robust to certain standard errors, such as the deletion of a key without a corresponding insertion or the insertion of two distinct values for a key. The properties of our structure make it suitable for several applications, including the database and networking applications that we highlight.
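To make the four operations concrete, here is a minimal Python sketch of one way such a table could be laid out. It is an illustration under stated assumptions, not the construction as specified in the paper: keys and values are assumed to be non-negative integers, each cell is assumed to hold a count, a key sum, and a value sum, the table is split into k sub-tables so that a key's k cells are always distinct, and SHA-256 stands in for the hash functions. The class name `IBLT` and all method names are hypothetical. The sketch also assumes each key is inserted at most once and every deletion matches an earlier insertion, so it does not cover the error-robust variations mentioned above.

```python
import hashlib


class IBLT:
    """Toy invertible Bloom lookup table (hypothetical names, integer keys/values)."""

    def __init__(self, m, k=3):
        assert m % k == 0, "m must be a multiple of k"
        self.m, self.k = m, k
        self.count = [0] * m      # number of pairs hashed to each cell
        self.key_sum = [0] * m    # sum of the keys hashed to each cell
        self.value_sum = [0] * m  # sum of the values hashed to each cell

    def _cells(self, key):
        # One cell per sub-table of width m/k, so the k cells are distinct.
        # SHA-256 is an assumption of this sketch, chosen for repeatability.
        width = self.m // self.k
        cells = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            cells.append(i * width + int.from_bytes(digest[:8], "big") % width)
        return cells

    def _update(self, key, value, sign):
        for c in self._cells(key):
            self.count[c] += sign
            self.key_sum[c] += sign * key
            self.value_sum[c] += sign * value

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        self._update(key, value, -1)

    def get(self, key):
        """Return (found, value); found is None when the lookup is inconclusive."""
        for c in self._cells(key):
            if self.count[c] == 0:
                return (False, None)   # an empty cell proves the key is absent
            if self.count[c] == 1:
                if self.key_sum[c] == key:
                    return (True, self.value_sum[c])
                return (False, None)   # the cell holds a single, different key
        return (None, None)            # every cell is shared: cannot decide

    def list_entries(self):
        """Peel cells that hold exactly one pair; return (pairs, complete)."""
        scratch = IBLT(self.m, self.k)  # work on a copy, keep the table intact
        scratch.count = self.count[:]
        scratch.key_sum = self.key_sum[:]
        scratch.value_sum = self.value_sum[:]
        pairs = []
        pending = [c for c in range(self.m) if scratch.count[c] == 1]
        while pending:
            c = pending.pop()
            if scratch.count[c] != 1:
                continue  # the cell changed since it was queued
            key, value = scratch.key_sum[c], scratch.value_sum[c]
            pairs.append((key, value))
            scratch.delete(key, value)  # may expose new single-pair cells
            pending.extend(d for d in scratch._cells(key) if scratch.count[d] == 1)
        return pairs, all(n == 0 for n in scratch.count)
```

In this sketch, `list_entries` reports `complete=False` when the peeling process stalls, which is the expected outcome while the table holds far more pairs than it has cells. Deleting pairs until the load is low again lets a later call succeed, because the sums in each cell are unaffected by how full the table was in the meantime.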
