Performance of Point and Range Queries for In-memory Databases Using Radix Trees on GPUs

In in-memory database systems augmented by hardware accelerators, accelerating index search operations can greatly improve the runtime performance of database queries. Recently, adaptive radix trees (ART) have been shown to provide a very fast index search implementation on the CPU. Here, we focus on an accelerator-based implementation of ART. We present a detailed performance study of our GPU-based adaptive radix tree (GRT) implementation over a variety of key distributions, synthetic benchmarks, and actual keys from music and book data sets. We also compare its performance with that of other index-searching schemes on the GPU. GRT on modern GPUs achieves some of the highest index search rates reported in the literature. For point queries, GRT achieves a throughput of up to 106 million and 130 million lookups per second for sparse and dense keys, respectively. For range queries, GRT yields 600 million and 1000 million lookups per second for sparse and dense keys, respectively, on a large dataset of 64 million 32-bit keys.
