A Shifting Bloom Filter Framework for Set Queries

Set queries are fundamental operations in computer systems and applications. This paper addresses the problem of designing a probabilistic data structure that can process set queries quickly using a small amount of memory. We propose a Shifting Bloom Filter (ShBF) framework for representing and querying sets. We demonstrate the effectiveness of ShBF on three popular types of set queries: membership, association, and multiplicity queries. The key novelty of ShBF lies in encoding the auxiliary information of a set element in a location offset. In contrast, prior Bloom filter (BF) based set data structures allocate additional memory to store auxiliary information. We conducted experiments using real-world network traces, and the results show that ShBF significantly advances the state of the art on all three types of set queries.
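To make the offset idea concrete, the following is a minimal illustrative sketch in Python, not the paper's exact construction. It assumes the auxiliary information is a small non-negative integer (for example, a multiplicity minus one), and it stores an element by setting bits at each hashed base position shifted by that integer. The class name `ShiftingBloomSketch`, its parameters `m`, `k`, and `max_offset`, and the use of SHA-256 to derive hash positions are all assumptions chosen for illustration.

```python
import hashlib


class ShiftingBloomSketch:
    """Illustrative offset-encoding Bloom filter (hypothetical, not the paper's code)."""

    def __init__(self, m=1024, k=4, max_offset=16):
        self.m = m                    # number of base positions
        self.k = k                    # number of hash functions
        self.max_offset = max_offset  # offsets are in [0, max_offset)
        # Extra max_offset slots so base + offset never runs past the end.
        self.bits = bytearray(m + max_offset)

    def _bases(self, item):
        # Derive k base positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item, offset):
        # The auxiliary information is stored only as a shift of the set bits.
        if not 0 <= offset < self.max_offset:
            raise ValueError("offset out of range")
        for base in self._bases(item):
            self.bits[base + offset] = 1

    def candidates(self, item):
        # An offset is a candidate if all k shifted bits are set.
        # False positives are possible; false negatives are not.
        bases = list(self._bases(item))
        return [o for o in range(self.max_offset)
                if all(self.bits[b + o] for b in bases)]


f = ShiftingBloomSketch()
f.insert("flow-a", 3)
print(3 in f.candidates("flow-a"))  # True: an inserted offset is always reported
```

In this sketch no separate counter or value field is allocated per element; the only memory cost of supporting offsets is the `max_offset` padding at the end of the bit array. A query may return extra candidate offsets because of hash collisions, so a real design needs a rule for choosing among them.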

[13]  Gaogang Xie, et al.  A shifting bloom filter framework for set queries, 2016, VLDB 2016.
