Difference Bloom Filter: A probabilistic structure for multi-set membership query

Given v sets and an incoming item e, a multi-set membership query reports which set contains e. Multi-set membership query is a fundamental problem in computer systems and applications. No existing data structure achieves small memory usage, fast query speed, and high accuracy at the same time. In this paper, we propose a novel probabilistic data structure named Difference Bloom Filter (DBF) for fast multi-set membership query, which is both more accurate and faster to query than the state of the art. DBF rests on two key design principles. The first is to make the representation of each element's membership exclusive by writing different numbers of 1s and 0s in the same filter. The second is to use slow but cheap DRAM to improve the accuracy of the filter held in fast but expensive SRAM. Experimental results show that DBF is hundreds of times more accurate than the state-of-the-art vBF and ShBF. Furthermore, we have made the source code of DBF available at our homepage [1] and GitHub [2].
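To make the problem statement concrete, the following is a minimal sketch of a multi-set membership query built from one standard Bloom filter per set. It is a naive baseline and not the DBF design proposed here; the class and function names (`BloomFilter`, `build_filters`, `query`), the SHA-256 based hashing, and the default sizes are assumptions chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter (illustrative baseline, not DBF)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def build_filters(sets, num_bits=1024, num_hashes=4):
    """Build one Bloom filter per input set."""
    filters = []
    for members in sets:
        bf = BloomFilter(num_bits, num_hashes)
        for item in members:
            bf.add(item)
        filters.append(bf)
    return filters


def query(filters, item):
    """Return the indices of every set whose filter reports the item."""
    return [i for i, bf in enumerate(filters) if item in bf]


sets = [{"a", "b"}, {"c"}, {"d", "e"}]
filters = build_filters(sets)
candidates = query(filters, "c")
```

The set that really contains the item always appears among the candidates, because a Bloom filter has no false negatives. A false positive in another filter, however, yields more than one candidate and leaves the answer ambiguous, and the query has to probe all v filters. These are the accuracy and speed costs that motivate a single-filter design.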

[1] Andrei Broder, et al. Network Applications of Bloom Filters: A Survey, 2004, Internet Math.

[2] Kumar Chellapilla, et al. Bloomier Filters: A second look, 2008, ESA.

[3] MyungKeun Yoon, et al. Bloom tree: A search tree based on Bloom filters for multiple-set membership testing, 2014, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.

[4] Gaogang Xie, et al. A Shifting Bloom Filter Framework for Set Queries, 2015, Proc. VLDB Endow.

[5] Qi Li, et al. Guarantee IP lookup performance with FIB explosion, 2015, SIGCOMM 2015.

[6] John W. Lockwood, et al. Deep packet inspection using parallel bloom filters, 2004, IEEE Micro.

[7] Andreas Herkersdorf, et al. Technologies and building blocks for fast packet forwarding, 2001.

[8] Friedhelm Meyer auf der Heide, et al. Dynamic perfect hashing: upper and lower bounds, 1988, [Proceedings 1988] 29th Annual Symposium on Foundations of Computer Science.

[9] Li Fan, et al. Summary cache: a scalable wide-area web cache sharing protocol, 2000, TNET.

[10] Tian He, et al. kBF: A Bloom Filter for key-value storage with an application on approximate state machines, 2014, IEEE INFOCOM 2014 - IEEE Conference on Computer Communications.

[11] Balaji Prabhakar, et al. Bloom filters: Design innovations and novel applications, 2005.

[12] Shigang Chen, et al. When Bloom Filters Are No Longer Compact: Multi-Set Membership Lookup for Network Applications, 2016, IEEE/ACM Transactions on Networking.

[13] Kang Li, et al. Approximate caches for packet classification, 2004, IEEE INFOCOM 2004.

[14] Fang Hao, et al. Fast Multiset Membership Testing Using Combinatorial Bloom Filters, 2009, INFOCOM.

[15] Minlan Yu, et al. BUFFALO: bloom filter forwarding architecture for large organizations, 2009, CoNEXT '09.

[16] Burton H. Bloom, et al. Space/time trade-offs in hash coding with allowable errors, 1970, CACM.