Shared scans on main memory column stores

Column stores and shared scans have both proven effective at improving performance for many workloads. At the same time, a recent hardware trend makes it possible to keep most data in main memory. This paper builds on these trends and explores how to implement shared scans efficiently on main-memory column stores. In particular, it proposes new approaches to avoid unnecessary work and to best implement position lists in such a query processing architecture. Performance experiments with real workloads from the travel industry show the advantages of combining column stores and shared scans in main memory over traditional database architectures.
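
To make the two central notions concrete, the following is a minimal sketch of a shared scan that produces position lists. It is an illustration under stated assumptions, not the implementation proposed in the paper: the names `shared_scan` and `fetch`, the toy columns, and the use of plain Python lists as position lists are all hypothetical. The idea shown is that one pass over a column serves every registered query, and each query receives the row positions that satisfied its predicate.

```python
from typing import Callable, Dict, List, Sequence


def shared_scan(
    column: Sequence[int],
    predicates: Dict[str, Callable[[int], bool]],
) -> Dict[str, List[int]]:
    """Scan `column` once; return, per query id, the matching row positions."""
    positions: Dict[str, List[int]] = {qid: [] for qid in predicates}
    for pos, value in enumerate(column):        # single pass over the column
        for qid, pred in predicates.items():    # each query inspects the value
            if pred(value):
                positions[qid].append(pos)
    return positions


def fetch(column: Sequence, position_list: List[int]) -> list:
    """Materialize values of another column at the given positions."""
    return [column[p] for p in position_list]


# Hypothetical toy data: two columns of the same table.
price = [120, 80, 300, 80, 150]
carrier = ["LH", "AF", "LH", "BA", "AF"]

# Two concurrent queries share one scan of the `price` column.
result = shared_scan(
    price,
    {"q1": lambda v: v < 100, "q2": lambda v: v >= 150},
)
q1_carriers = fetch(carrier, result["q1"])
q2_carriers = fetch(carrier, result["q2"])
```

In this sketch the position lists are sorted by construction, so a later column can be read in position order. Whether lists, bitmaps, or another representation is the better choice is exactly the kind of design question the abstract raises, and the sketch takes no position on it.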
