DB2 with BLU Acceleration: So Much More than Just a Column Store

DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables. Compared to traditional row-organized tables, these techniques speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values. These values can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multiple-data) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLU's hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL. Users can also mix column-organized and row-organized tables in the same tablespace and even within the same query.
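
To make the idea of evaluating a range predicate directly on bit-packed dictionary codes more concrete, the following is a minimal Python sketch, not DB2 BLU's actual implementation. It assumes a single order-preserving dictionary (the frequency-based partitioning mentioned above is omitted), reserves one spare bit per packed field, and emulates a 64-bit register with a Python integer in place of real SIMD instructions. The names `build_dictionary`, `pack`, and `scan_le` are hypothetical, and the predicate constant is assumed to be present in the dictionary.

```python
WORD_BITS = 64  # width of the emulated register


def build_dictionary(values):
    """Order-preserving dictionary: sorted distinct values get dense codes."""
    return {v: code for code, v in enumerate(sorted(set(values)))}


def pack(codes, width):
    """Pack codes into 64-bit words, using width + 1 bits per field.

    The extra top bit of each field stays 0; the scan uses it as a flag.
    """
    field = width + 1
    per_word = WORD_BITS // field
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            word |= code << (slot * field)
        words.append(word)
    return words


def scan_le(words, width, n_values, bound_code):
    """Return row ids whose code is <= bound_code, testing a whole word at once."""
    field = width + 1
    per_word = WORD_BITS // field
    # Spare (top) bit of every field, and the bound replicated into every field.
    spare = sum(1 << (slot * field + width) for slot in range(per_word))
    bound = sum(bound_code << (slot * field) for slot in range(per_word))
    hits = []
    for w_idx, word in enumerate(words):
        # Per field this computes 2**width + bound - code, which never borrows
        # from a neighbouring field; the spare bit survives iff code <= bound.
        result = ((bound | spare) - word) & spare
        for slot in range(per_word):
            row = w_idx * per_word + slot
            # Skip padding slots in the last word.
            if row < n_values and (result >> (slot * field + width)) & 1:
                hits.append(row)
    return hits


# Hypothetical column of country codes; the predicate is country <= 'FR'.
values = ["DE", "US", "FR", "US", "BR", "US", "DE", "JP", "US", "FR"]
dictionary = build_dictionary(values)
width = max(1, (len(dictionary) - 1).bit_length())
words = pack([dictionary[v] for v in values], width)
hits = scan_le(words, width, len(values), dictionary["FR"])
```

Because the dictionary preserves value order, the comparison on integer codes gives the same answer as the comparison on the original strings, and a single subtraction tests every field in the word. In this sketch 16 four-bit fields fit in one 64-bit word, so the ten-row column is scanned with one arithmetic operation before the matching row ids are extracted.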
