Improving In-memory Column-Store Database Predicate Evaluation Performance on Multi-core Systems

The need to analyze large volumes of data for business intelligence has driven various innovations in database technology. One example is the increased interest in using column-oriented data layouts to improve query performance in analytical and warehousing workloads. As system architectures move towards multi-core designs, it is important to optimize the performance of these workloads on such platforms. In this paper we present SPHINX, an architecture that utilizes multi-core systems for search-based predicate evaluation operations in analytical query workloads against an in-memory column store. We discuss the natural parallelism of predicate evaluation and the various bottlenecks that impact search performance. We present several performance improvement techniques and apply a scan sharing technique based on cache reuse efficiency to further improve performance. We demonstrate the performance benefits of our scan sharing scheduler over other scheduling approaches on a workload of mixed search queries.
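To make the two ideas named in the abstract concrete, the following is a minimal sketch and not the SPHINX implementation. It assumes a column stored as a plain Python list and predicates given as ordinary functions. The names `shared_scan` and `parallel_shared_scan`, and the parameters `chunk_size` and `workers`, are hypothetical. The first function illustrates scan sharing: each chunk of the column is visited once and every pending predicate is applied to it before the scan moves on. The second illustrates the natural parallelism of predicate evaluation by scanning disjoint partitions of the column independently.

```python
from concurrent.futures import ThreadPoolExecutor


def shared_scan(column, predicates, chunk_size=4):
    """Apply every predicate to a chunk before advancing to the next chunk.

    Returns one list of matching row positions per predicate.
    """
    matches = [[] for _ in predicates]
    for start in range(0, len(column), chunk_size):
        chunk = column[start:start + chunk_size]
        # All queries reuse the same chunk while it is (conceptually) cached.
        for q, pred in enumerate(predicates):
            for offset, value in enumerate(chunk):
                if pred(value):
                    matches[q].append(start + offset)
    return matches


def parallel_shared_scan(column, predicates, workers=2, chunk_size=4):
    """Scan disjoint partitions of the column independently, then merge."""
    n = len(column)
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]

    def scan_partition(bound):
        lo, hi = bound
        local = shared_scan(column[lo:hi], predicates, chunk_size)
        # Translate partition-local positions back to global row positions.
        return [[lo + i for i in rows] for rows in local]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(scan_partition, bounds))  # preserves partition order

    return [[row for part in parts for row in part[q]]
            for q in range(len(predicates))]


if __name__ == "__main__":
    column = [5, 12, 7, 30, 18, 3, 25, 12, 9, 40]
    predicates = [lambda v: v > 10, lambda v: v == 12]
    print(parallel_shared_scan(column, predicates, workers=3))
```

Python threads are used here only to show the partitioning structure. Because of the interpreter's global lock they do not yield real multi-core speedup for this kind of loop, and the cache behavior discussed in the abstract is a property of a native implementation that this sketch does not reproduce.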
