A General-Purpose Counting Filter: Making Every Bit Count

Approximate Membership Query (AMQ) data structures, such as the Bloom filter, quotient filter, and cuckoo filter, have found numerous applications in databases, storage systems, networks, computational biology, and other domains. However, many applications must work around limitations in the capabilities or performance of current AMQs, making these applications more complex and less performant. For example, many current AMQs cannot delete items or count the number of occurrences of each input item, take up large amounts of space, are slow, cannot be resized or merged, or have poor locality of reference and hence perform poorly when stored on SSD or disk. This paper proposes a new general-purpose AMQ, the counting quotient filter (CQF). The CQF supports approximate membership testing and counting the occurrences of items in a data set. This general-purpose AMQ is small and fast, has good locality of reference, scales out of RAM to SSD, and supports deletions, counting (even on skewed data sets), resizing, merging, and highly concurrent access. The paper reports on the structure's performance on both manufactured and application-generated data sets. In our experiments, the CQF performs in-memory inserts and queries up to an order of magnitude faster than the original quotient filter, several times faster than a Bloom filter, and similarly to the cuckoo filter, even though none of these other data structures support counting. On SSD, the CQF outperforms the other structures by a factor of at least 2 because of its good data locality. The CQF achieves these performance gains by restructuring the metadata bits of the quotient filter to obtain fast lookups at high load factors (i.e., even when the data structure is almost full). As a result, the CQF offers good lookup performance even up to a load factor of 95%. Counting is essentially free in the CQF in the sense that the structure is as space-efficient as, or even more space-efficient than, non-counting data structures (e.g., Bloom, quotient, and cuckoo filters). The paper also shows how to speed up CQF operations by using new x86 bit-manipulation instructions introduced in Intel's Haswell line of processors. The restructured metadata transforms many quotient filter metadata operations into rank-and-select bit-vector operations. Thus, our efficient implementations of rank and select may be useful for other rank-and-select-based data structures.
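The rank-and-select primitives mentioned above map almost directly onto the Haswell bit-manipulation instructions POPCNT, PDEP (from BMI2), and TZCNT. The C sketch below illustrates the general technique on 64-bit words; the function names bit_rank and bit_select are hypothetical, and the code assumes a Haswell-capable compiler target (e.g., gcc -O2 -march=haswell). It is a sketch of the idea, not the paper's actual implementation.

    /* Sketch: 64-bit rank and select via Haswell bit-manipulation
       instructions. Names and layout are illustrative only. */
    #include <stdint.h>
    #include <x86intrin.h>  /* _pdep_u64 (BMI2), _tzcnt_u64 (BMI1) */

    /* rank(x, i): number of set bits of x in positions 0..i, inclusive. */
    static inline uint64_t bit_rank(uint64_t x, int i) {
        /* Mask off everything above position i, then count set bits. */
        uint64_t mask = (i >= 63) ? ~0ULL : ((1ULL << (i + 1)) - 1);
        return (uint64_t)__builtin_popcountll(x & mask);
    }

    /* select(x, k): position of the k-th set bit of x (k = 0 is the
       first). PDEP deposits the single source bit (1 << k) at the k-th
       set-bit position of x; TZCNT then reads off that position.
       Returns 64 when x has at most k set bits. */
    static inline uint64_t bit_select(uint64_t x, int k) {
        return _tzcnt_u64(_pdep_u64(1ULL << k, x));
    }

The appeal of the PDEP-based select is that it answers the query in a constant number of instructions instead of a loop over bytes or bits, which is consistent with the abstract's claim that these instructions keep metadata operations cheap even at high load factors.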
