Retouched Bloom Filters: Allowing Networked Applications to Flexibly Trade Off False Positives Against False Negatives

Where distributed agents must share voluminous set membership information, Bloom filters provide a compact, though lossy, way for them to do so. Numerous recent networking papers have examined the trade-off between the bandwidth consumed by transmitting Bloom filters and their error rate, which takes the form of false positives and rises as the filters are compressed further. In this paper, we introduce the retouched Bloom filter (RBF), an extension that makes the Bloom filter more flexible by permitting the removal of selected false positives at the expense of generating random false negatives. We show analytically that RBFs created through a random process maintain an overall error rate, expressed as a combination of the false positive rate and the false negative rate, that is equal to the false positive rate of the corresponding Bloom filters. We further provide simple heuristics and improved algorithms that, when creating RBFs, decrease the false positive rate by more than the corresponding increase in the false negative rate. Finally, we demonstrate the advantages of an RBF over a Bloom filter in a distributed network topology measurement application, where information about large stop sets must be shared among route tracing monitors.
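
To make the trade-off concrete, the following is a minimal Python sketch of one way a false positive could be removed from a Bloom filter. It assumes that removal works by resetting one of the bits the offending key hashes to, which is an illustrative reading rather than a detail stated in the abstract. The class name `RetouchedBloomFilter`, the method `clear_false_positive`, the salted SHA-256 hashing, and the choice to clear the key's first bit are all hypothetical.

```python
import hashlib


class RetouchedBloomFilter:
    """Toy Bloom filter with a bit-clearing step (illustrative sketch only)."""

    def __init__(self, m: int, k: int):
        self.m = m            # number of bits in the vector
        self.k = k            # number of hash functions
        self.bits = [0] * m

    def _positions(self, key: str):
        # Assumption: derive k positions by salting SHA-256 with the hash index.
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{key}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        ]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p] = 1

    def __contains__(self, key: str) -> bool:
        return all(self.bits[p] for p in self._positions(key))

    def clear_false_positive(self, key: str) -> None:
        # Assumption: reset the first of the key's bits. The key then tests
        # negative, but any member sharing that bit becomes a false negative.
        self.bits[self._positions(key)[0]] = 0


if __name__ == "__main__":
    rbf = RetouchedBloomFilter(m=64, k=3)
    members = [f"member-{i}" for i in range(20)]
    for x in members:
        rbf.add(x)

    # Look for non-members that the filter wrongly reports as present.
    probes = [f"probe-{i}" for i in range(1000)]
    false_positives = [x for x in probes if x in rbf]

    if false_positives:
        rbf.clear_false_positive(false_positives[0])
        false_negatives = [x for x in members if x not in rbf]
        print("removed:", false_positives[0])
        print("false negatives created:", len(false_negatives))
```

Clearing a single bit guarantees that the chosen key no longer tests positive. The number of false negatives it creates depends on how many set members share that bit, which is why the choice of bit matters in practice.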
