The Variable-Increment Counting Bloom Filter

Counting Bloom Filters (CBFs) are widely used in networking device algorithms. They implement fast set representations that support membership queries with limited error and, unlike standard Bloom filters, also support element deletions. However, they consume significant amounts of memory. In this paper, we introduce a new general method based on variable increments to improve the efficiency of CBFs and their variants. Unlike in CBFs, at each element insertion the hashed counters are incremented by a hashed variable increment instead of a unit increment. Then, to query an element, the exact value of each counter is considered, not just whether it is positive. We present two simple schemes based on this method. We demonstrate that this method can always achieve a lower false positive rate and a lower overflow probability bound than a CBF in practical systems. We also show how it can be easily implemented in hardware, with limited added complexity and memory overhead. We further explain how this method can extend many variants of the CBF that have been published in the literature. We then suggest possible improvements to the presented schemes and provide lower bounds on their memory consumption. Lastly, using simulations with real-life traces and hash functions, we show that the method can significantly improve the false positive rate of CBFs given the same amount of memory.
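The insertion and query rules described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name, the parameters (num_counters, num_hashes, L), the SHA-256-based hashing, and the choice of increments from the set {L, ..., 2L-1} are all assumptions made for illustration. With that increment set, any non-empty sum of increments is at least L, so a counter value c can include an increment v only if c - v is 0 or at least L. This is the sense in which the query looks at the exact counter value rather than only whether it is positive.

```python
import hashlib


class VariableIncrementCBF:
    """Illustrative variable-increment counting Bloom filter (a sketch)."""

    def __init__(self, num_counters=64, num_hashes=3, L=4):
        self.m = num_counters          # number of counters
        self.k = num_hashes            # hash functions per element
        self.L = L                     # increments are drawn from {L, ..., 2L-1}
        self.counters = [0] * num_counters

    def _slots(self, item):
        """Yield (counter index, increment) pairs for an item."""
        for i in range(self.k):
            # Assumed hashing scheme: one SHA-256 digest per hash function,
            # split into a counter index and a variable increment.
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            index = int.from_bytes(digest[:8], "big") % self.m
            increment = self.L + int.from_bytes(digest[8:16], "big") % self.L
            yield index, increment

    def add(self, item):
        for index, increment in self._slots(item):
            self.counters[index] += increment

    def remove(self, item):
        # Only valid for items that were previously added.
        for index, increment in self._slots(item):
            self.counters[index] -= increment

    def might_contain(self, item):
        for index, increment in self._slots(item):
            rest = self.counters[index] - increment
            # The remainder must be 0 (item alone in the counter) or a valid
            # sum of other increments, which is always at least L.
            if rest != 0 and rest < self.L:
                return False
        return True


vicbf = VariableIncrementCBF()
vicbf.add("10.0.0.1")
vicbf.add("10.0.0.2")
print(vicbf.might_contain("10.0.0.1"))
vicbf.remove("10.0.0.1")
print(vicbf.might_contain("10.0.0.2"))
```

A plain CBF would answer "possibly present" whenever all hashed counters are positive. The extra remainder test here can rule out some of those cases, while still never rejecting an element that was actually inserted. The sketch ignores counter overflow, which a fixed-width hardware implementation would need to handle.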
