Paper information - Row-wise parallel predicate evaluation

Row-wise parallel predicate evaluation

Table scans have become more interesting recently due to greater use of ad-hoc queries and greater availability of multi-core, vector-enabled hardware. Table scan performance is limited by value representation, table layout, and processing techniques. In this paper we propose a new layout and processing technique for efficient one-pass predicate evaluation. Starting with a set of rows with a fixed number of bits per column, we append columns to form a set of banks and then pad each bank to a supported machine word length, typically 16, 32, or 64 bits. We then evaluate partial predicates on the columns of each bank, using a novel evaluation strategy that evaluates column level equality, range tests, IN-list predicates, and conjuncts of these predicates, simultaneously on multiple columns within a bank, and on multiple rows within a machine register. This approach outperforms pure column stores, which must evaluate the partial predicates one column at a time. We evaluate and compare the performance and representation overhead of this new approach and several proposed alternatives.
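The banked layout and the idea of testing several columns and several rows with one register operation can be illustrated with a small sketch. This is not the paper's evaluation strategy: it covers only conjuncts of equality predicates, and the column widths, the 16-bit bank size, the placement of column 0 in the low bits, and all function names are assumptions made for illustration. It also assumes the padding leaves the top bit of each bank unused, so that a per-lane addition cannot carry into the neighbouring row.

```python
# Hypothetical bank: three columns of 5, 4 and 3 bits (12 bits), padded to 16.
WIDTHS = [5, 4, 3]
BANK_BITS = 16
ROWS_PER_REG = 64 // BANK_BITS          # four rows share one 64-bit register
OFFSETS = [sum(WIDTHS[:i]) for i in range(len(WIDTHS))]

# The zero test below needs the top bit of every bank to be padding.
assert sum(WIDTHS) < BANK_BITS


def pack_row(values):
    """Append the column codes of one row into a single bank word."""
    word = 0
    for value, width, offset in zip(values, WIDTHS, OFFSETS):
        assert 0 <= value < (1 << width)
        word |= value << offset
    return word


def replicate(pattern):
    """Copy a bank-sized bit pattern into every lane of the register."""
    return sum(pattern << (lane * BANK_BITS) for lane in range(ROWS_PER_REG))


def pack_register(rows):
    """Place up to ROWS_PER_REG packed rows side by side in one register."""
    reg = 0
    for lane, row in enumerate(rows):
        reg |= pack_row(row) << (lane * BANK_BITS)
    return reg


def equality_conjunct(reg, wanted):
    """Evaluate AND of (column == constant) tests on all rows in `reg`.

    `wanted` maps a column index to the constant it must equal.
    Returns one boolean per lane.
    """
    mask = 0
    pattern = 0
    for col, const in wanted.items():
        mask |= ((1 << WIDTHS[col]) - 1) << OFFSETS[col]
        pattern |= const << OFFSETS[col]

    # Bits that differ from the constants, restricted to the tested columns.
    diff = (reg ^ replicate(pattern)) & replicate(mask)

    # Adding 0x7FFF to each lane sets the lane's top bit iff diff != 0 there.
    high = replicate(1 << (BANK_BITS - 1))
    low = replicate((1 << (BANK_BITS - 1)) - 1)
    nonzero = (diff + low) & high
    match = ~nonzero & high

    top = BANK_BITS - 1
    return [bool((match >> (lane * BANK_BITS + top)) & 1)
            for lane in range(ROWS_PER_REG)]


rows = [(3, 7, 1), (3, 7, 2), (4, 7, 1), (3, 7, 1)]
reg = pack_register(rows)
# Predicate: column 0 == 3 AND column 2 == 1, tested on four rows at once.
result = equality_conjunct(reg, {0: 3, 2: 1})
print(result)
```

In this sketch a single XOR, AND and ADD on the register tests both columns of all four rows, whereas a column-at-a-time scan would need a separate pass for each column. Range tests and IN-list predicates need additional bit manipulation that is not shown here.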

[1] Kenneth A. Ross, et al. Implementing database operations using SIMD instructions, 2002, SIGMOD '02.

[2] György Dósa, et al. The Tight Bound of First Fit Decreasing Bin-Packing Algorithm Is FFD(I) <= 11/9 OPT(I) + 6/9, 2007, ESCAPE.

[3] Daniel J. Abadi, et al. Integrating compression and execution in column-oriented database systems, 2006, SIGMOD Conference.

[4] Marcin Zukowski, et al. Super-Scalar RAM-CPU Cache Compression, 2006, 22nd International Conference on Data Engineering (ICDE'06).

[5] Ramesh C. Agarwal, et al. Block oriented processing of relational database operations in modern computer architectures, 2001, Proceedings 17th International Conference on Data Engineering.

[6] Garret Swart, et al. How to wring a table dry: entropy compression of relations and querying of compressed relations, 2006, VLDB.

[7] Marcin Zukowski, et al. MonetDB/X100: Hyper-Pipelining Query Execution, 2005, CIDR.

[8] Paul Zikopoulos, et al. IBM DB2 9 New Features, 2007.

[9] Roger MacNicol, et al. Sybase IQ Multiplex - Designed For Analytics, 2004, VLDB.

[10] Frederick Reiss, et al. Constant-Time Query Processing, 2008, 2008 IEEE 24th International Conference on Data Engineering.

[11] David J. DeWitt, et al. Data page layouts for relational databases on deep memory hierarchies, 2002, The VLDB Journal.

[12] Meikel Pöss, et al. Data Compression in Oracle, 2003, VLDB.

[13] Martin L. Kersten, et al. Database Architecture Optimized for the New Bottleneck: Memory Access, 1999, VLDB.