Cooking DBMS Operations using Granular Primitives

The increasing heterogeneity of the underlying hardware forces modern database system engineers to implement multiple variants of a single database operator (e.g., join, selection). As heterogeneity grows, these variants become too complex to maintain and tune for different devices. To overcome these disadvantages, developers use an alternative, primitive-based operator design. This design paradigm splits database operators into granular functions, or primitives, and executes a given operator by combining the necessary primitives. Because the same primitives are reused across multiple database operations, only a limited set of them is required, and tuning a single primitive improves the efficiency of every database operation that uses it. In this survey, we provide an overview of a primitive-based database engine. First, we list different primitives from the literature and place them in a hierarchy from the finest granular level to a complete database operator. Second, for each primitive, we list its possible tuning opportunities. Finally, we discuss the significance of primitive-based execution for the query engine. Overall, this survey aims to serve as a general reference for implementing a primitive-based query engine and for possible strategies to tune it for specific processors.
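To make the idea of composing an operator from primitives concrete, the following minimal sketch builds a selection from three small functions (map, prefix sum, scatter) and then reuses the map primitive for a count. The function names and this particular decomposition are assumptions chosen for illustration; they are not taken from the survey, and a real engine would run such primitives as tuned, device-specific kernels rather than Python loops.

```python
from itertools import accumulate


def map_predicate(column, predicate):
    """Map primitive: evaluate the predicate per value, yielding 0/1 flags."""
    return [1 if predicate(v) else 0 for v in column]


def exclusive_prefix_sum(flags):
    """Prefix-sum primitive: output slot for each row, counting earlier matches."""
    if not flags:
        return []
    return [0] + list(accumulate(flags))[:-1]


def scatter(column, flags, offsets):
    """Scatter primitive: write each flagged value to its computed output slot."""
    total = offsets[-1] + flags[-1] if flags else 0
    out = [None] * total
    for value, flag, offset in zip(column, flags, offsets):
        if flag:
            out[offset] = value
    return out


def select(column, predicate):
    """Selection operator expressed as map -> prefix sum -> scatter."""
    flags = map_predicate(column, predicate)
    offsets = exclusive_prefix_sum(flags)
    return scatter(column, flags, offsets)


def count_matches(column, predicate):
    """A second operator that reuses the same map primitive, followed by a reduce."""
    return sum(map_predicate(column, predicate))


if __name__ == "__main__":
    prices = [5, 12, 7, 20, 3]
    print(select(prices, lambda v: v > 6))
    print(count_matches(prices, lambda v: v > 6))
```

Because `select` and `count_matches` share `map_predicate`, any speed-up applied to that one primitive (for example, a vectorized implementation) would benefit both operators, which is the reuse argument made above.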
