Processing Declarative Queries through Generating Imperative Code in Managed Runtimes

We present the results of our work on integrating database and programming language runtimes through code generation and extensive just-in-time adaptation. Our techniques deliver significant performance improvements over non-integrated solutions. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their datasets entirely in persistent memory, and will be written in a single programming language employing higher-level APIs and language-integrated query.
