Probabilistic Data Structures in Adversarial Environments

Probabilistic data structures use space-efficient representations of data in order to (approximately) respond to queries about the data. Traditionally, these structures are accompanied by probabilistic bounds on query-response errors. These bounds implicitly assume benign attack models, in which the data and the queries are chosen non-adaptively and independently of the randomness used to construct the representation. Yet probabilistic data structures are increasingly used in settings where these assumptions may be violated. This work provides a provable-security treatment of probabilistic data structures in adversarial environments. We give a syntax that captures a wide variety of in-use structures, and our security notions support the development of error bounds in the presence of powerful attacks. Concretely, we focus primarily on the widely used Bloom filter, but also consider counting (Bloom) filters and count-min sketch data structures. For the traditional versions of these structures, our security findings are largely negative; however, we show that simple embellishments (e.g., using salts or secret keys) yield structures that provide provable security with little overhead.
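To make the idea of a keyed embellishment concrete, the following is a minimal illustrative sketch of a Bloom filter whose bit positions are derived with HMAC-SHA256 under a secret key. It is not the construction analyzed in this work; the class name `KeyedBloomFilter`, the parameters `m_bits` and `k_hashes`, and the choice of HMAC-SHA256 with a counter prefix are all assumptions made for illustration. The intent is only to show why an adversary who does not know the key cannot precompute which inputs collide.

```python
import hashlib
import hmac
import secrets


class KeyedBloomFilter:
    """Illustrative keyed Bloom filter (hypothetical; not the paper's scheme)."""

    def __init__(self, m_bits, k_hashes, key=None):
        self.m = m_bits                      # number of bits in the filter
        self.k = k_hashes                    # number of positions set per item
        # Assumption: a fresh random 32-byte key when none is supplied.
        self.key = key if key is not None else secrets.token_bytes(32)
        self.bits = bytearray((m_bits + 7) // 8)

    def _indices(self, item):
        # Derive k positions by MACing a 4-byte counter followed by the item.
        for i in range(self.k):
            msg = i.to_bytes(4, "big") + item
            mac = hmac.new(self.key, msg, hashlib.sha256).digest()
            yield int.from_bytes(mac[:8], "big") % self.m

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(
            self.bits[idx // 8] & (1 << (idx % 8))
            for idx in self._indices(item)
        )
```

A filter built this way could be used as `bf = KeyedBloomFilter(1024, 4)`, then `bf.add(b"alice")` and `b"alice" in bf`. Inserted items are always reported as present, while the positions they occupy depend on the key, which must stay hidden from whoever chooses the inputs.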
