K-Divided Bloom Filter Algorithm and Its Analysis

A Bloom filter represents a data set with a bit vector and a set of hash functions, which lets it answer efficiently whether a given element belongs to the set. The split Bloom filter is a refinement of the Bloom filter that represents the data set with an S × N bit matrix. In distributed systems, if the number of elements grows continually, the rising error rate of a Bloom filter eventually renders the representation meaningless, and the split Bloom filter only mitigates this problem. This paper presents a new kind of Bloom filter, the K-divided Bloom filter. Compared with the split Bloom filter, it reduces space and time costs while achieving comparable or better performance. The K-divided Bloom filter thus offers a better tradeoff among error rate, space, and time.
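As background, the following is a minimal sketch of a standard Bloom filter, not of the split or K-divided variants, whose construction is not described here. The class name `BloomFilter`, the parameters `m` (bits) and `k` (hash functions), and the double-hashing scheme built on SHA-256 are assumptions made for illustration. The error estimate uses the usual approximation (1 - e^(-kn/m))^k, which assumes independent, uniformly distributed hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative standard Bloom filter: an m-bit vector and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m                            # number of bits in the vector
        self.k = k                            # number of hash functions
        self.bits = bytearray((m + 7) // 8)   # bit vector, initially all zero
        self.count = 0                        # number of add() calls so far

    def _positions(self, item: str) -> list[int]:
        # Assumption: derive k positions by double hashing two 64-bit
        # halves of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the stride non-zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        # Set the k bits selected by the hash functions.
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)
        self.count += 1

    def __contains__(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def estimated_error_rate(self) -> float:
        # Approximate false-positive probability after `count` insertions.
        return (1.0 - math.exp(-self.k * self.count / self.m)) ** self.k


if __name__ == "__main__":
    bf = BloomFilter(m=1024, k=3)
    for n in (50, 200, 800):
        while bf.count < n:
            bf.add(f"item-{bf.count}")
        print(n, round(bf.estimated_error_rate(), 4))
```

Running the script prints the estimated error rate at three set sizes for a fixed `m`. The estimate rises as elements are added, which is the growth problem that motivates the split and K-divided designs.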
