Column Imprints is a pre-filtering secondary index for answering range queries. Imprints are lightweight: they are based on compressed bit-vectors, one per cacheline, that quickly determine whether the values in that cacheline can satisfy the predicates of a query. The main overhead of the imprints implementation is the many sequential value comparisons against the boundaries of a virtual equi-height histogram. Similarly, during query scans, many sequential value comparisons are performed to identify false positives. In this paper, we speed up imprints creation and querying by using advanced vectorization techniques. We also experimentally explore the benefits of stretching imprints to larger bit-vector sizes and larger blocks of data, using 256-bit SIMD registers. Our findings are very promising both for imprints and for future index design research that would employ advanced vectorization techniques together with larger (up to 512-bit) and more numerous (32, up from the current 16) SIMD registers.
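To make the two comparison-heavy steps concrete, the following is a minimal scalar sketch of imprint creation and querying. It is an illustration under stated assumptions, not the implementation evaluated in the paper: it uses no SIMD instructions and no bit-vector compression, it assumes 16 values per cacheline (64 bytes of 4-byte values), and the names `bin_of`, `build_imprints`, and `range_query` are hypothetical. The histogram boundaries are passed in as a sorted list rather than derived from a sample of the column.

```python
VALUES_PER_LINE = 16  # assumption: one 64-byte cacheline holds 16 four-byte values


def bin_of(value, bounds):
    """Return the histogram bin of `value` by comparing it against every
    boundary in turn. These are the sequential comparisons that the
    abstract identifies as the main overhead."""
    return sum(1 for b in bounds if value >= b)


def build_imprints(column, bounds):
    """Build one uncompressed bit-vector per cacheline-sized block.
    Bit k is set if at least one value in the block falls in bin k."""
    imprints = []
    for start in range(0, len(column), VALUES_PER_LINE):
        bits = 0
        for v in column[start:start + VALUES_PER_LINE]:
            bits |= 1 << bin_of(v, bounds)
        imprints.append(bits)
    return imprints


def range_query(column, imprints, bounds, low, high):
    """Return the positions of all values v with low <= v <= high."""
    lo_bin = bin_of(low, bounds)
    hi_bin = bin_of(high, bounds)
    # Mask with bits lo_bin..hi_bin set: every bin the range can touch.
    mask = ((1 << (hi_bin + 1)) - 1) & ~((1 << lo_bin) - 1)
    hits = []
    for i, bits in enumerate(imprints):
        if bits & mask == 0:
            continue  # no value in this block can qualify, so skip it
        start = i * VALUES_PER_LINE
        block = column[start:start + VALUES_PER_LINE]
        # The imprint only says the block may qualify; compare each value
        # to filter out false positives.
        for pos, v in enumerate(block, start):
            if low <= v <= high:
                hits.append(pos)
    return hits
```

Both inner loops, the boundary comparisons in `bin_of` and the false-positive check in `range_query`, apply the same comparison to many independent values, which is the pattern that SIMD compare instructions can process several values at a time.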