Building high accuracy bloom filters using partitioned hashing

The growing importance of operations such as packet-content inspection, packet classification based on non-IP headers, maintaining flow-state, etc. has led to increased interest in the networking applications of Bloom filters. This is because Bloom filters provide a relatively easy method for hardware implementation of set-membership queries. However, the tradeoff is that Bloom filters only provide a probabilistic test and membership queries can result in false positives. Ideally, we would like this false positive probability to be very low. The main contribution of this paper is a method for significantly reducing this false positive probability in comparison to existing schemes. This is done by developing a partitioned hashing method which results in a choice of hash functions that set far fewer bits in the Bloom filter bit vector than would be the case otherwise. This lower fill factor of the bit vector translates to a much lower false positive probability. We show experimentally that this improved choice can result in as much as a ten-fold increase in accuracy over standard Bloom filters. We also show that the scheme performs much better than other proposed schemes for improving Bloom filters.