BitWeaving: fast scans for main memory data processing

This paper focuses on running scans in a main memory data processing system at "bare metal" speed. Essentially, this means that the system must aim to process data at or near the speed of the processor (the fastest component in most system configurations). Scans are common in main memory data processing environments, and with state-of-the-art techniques it still takes many cycles per input tuple to apply simple predicates on a single column of a table. In this paper, we propose a technique called BitWeaving that exploits the parallelism available at the bit level in modern processors. BitWeaving operates on multiple bits of data in a single cycle, processing bits from different columns in each cycle. Thus, bits from a batch of tuples are processed in each cycle, allowing BitWeaving to drop the cycles per column to below one in some cases. BitWeaving comes in two flavors: BitWeaving/V, which looks like a columnar organization but at the bit level, and BitWeaving/H, which packs bits horizontally. In this paper we also develop the arithmetic framework that is needed to evaluate predicates using these BitWeaving organizations. Our experimental results show that both methods deliver significant performance benefits over existing state-of-the-art methods, and in some cases improve performance by more than an order of magnitude.
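
To make the idea of bit-level parallelism concrete, here is a minimal Python sketch of a horizontally packed scan. It is an illustration written for this summary, not the paper's implementation. It assumes a layout in which each k-bit column code sits in a (k+1)-bit field whose top bit is a zero delimiter, and the names `pack`, `scan_less_than`, and `unpack_bits` are hypothetical. A single addition, XOR, and AND then evaluates `code < constant` for every code in the word at once.

```python
def pack(codes, k):
    """Pack k-bit codes into one integer; each code gets a (k+1)-bit field
    whose top (delimiter) bit is left at zero. Layout is an assumption."""
    word = 0
    for i, code in enumerate(codes):
        if not 0 <= code < (1 << k):
            raise ValueError("code does not fit in k bits")
        word |= code << (i * (k + 1))
    return word


def scan_less_than(word, constant, k, n):
    """Evaluate code < constant for all n packed codes with word-level ops.

    For one field, constant + (2^k - 1 - code) reaches 2^k exactly when
    code < constant, which sets that field's delimiter bit. The sum never
    exceeds 2^(k+1) - 2, so no carry leaks into the neighbouring field.
    """
    low_mask = pack([(1 << k) - 1] * n, k)   # 0111... in every field
    high_mask = pack([1] * n, k) << k        # 1000... in every field
    y = pack([constant] * n, k)              # constant replicated per field
    return (y + (word ^ low_mask)) & high_mask


def unpack_bits(result, k, n):
    """Read the delimiter bit of each field back out as a list of booleans."""
    return [bool((result >> (i * (k + 1) + k)) & 1) for i in range(n)]


codes = [1, 5, 6, 1, 2, 7, 0, 3]   # eight 3-bit column codes
K = 3
word = pack(codes, K)
hits = unpack_bits(scan_less_than(word, 5, K, len(codes)), K, len(codes))
print(hits)
```

Python integers are unbounded, so this sketch ignores word-size limits; a real system would presumably work on fixed-width processor words and process one such word per step.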
