JIT happens: Transactional Graph Processing in Persistent Memory meets Just-In-Time Compilation

Graph databases are used for a range of applications, such as analyzing large networks, representing and querying knowledge graphs, and managing master data and complex data structures. Besides graph analytics, the transactional processing of concurrent updates and queries represents a challenging data management task. In this paper, we investigate the usage of persistent memory (PMem) as a very promising technology for graph processing. We present a novel architecture for transactional processing of queries and updates on a property graph model that exploits and addresses the specific characteristics of persistent memory through hybrid storage and memory management as well as a just-in-time (JIT) query compilation approach. Our experimental evaluation on interactive short read and update query workloads shows that PMem-based systems that are well designed to exploit PMem characteristics significantly outperform traditional disk-based systems and have only a small overhead compared to DRAM-only systems. Moreover, the evaluation shows that JIT compilation brings performance benefits, especially when an adaptive compilation approach is leveraged to hide the overhead of compilation as well as the latency of PMem.
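To illustrate the general idea behind adaptive compilation, the following is a minimal sketch and not the system described in the paper. It assumes a toy query plan made of two hypothetical operators (`filter_gt`, `add`), uses Python source generation as a stand-in for real machine-code generation, and processes input in small batches ("morsels"). Execution starts in an interpreter immediately while compilation runs in a background thread; once the compiled function is ready, the remaining morsels use it. All function and operator names are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def interpret(plan, row):
    """Interpreted path: walk the operator list for every row."""
    for op, arg in plan:
        if op == "filter_gt":
            if not row > arg:
                return None  # row is filtered out
        elif op == "add":
            row = row + arg
    return row


def compile_plan(plan):
    """Stand-in for JIT compilation: generate one specialized function."""
    lines = ["def compiled(row):"]
    for op, arg in plan:
        if op == "filter_gt":
            lines.append(f"    if not row > {arg!r}: return None")
        elif op == "add":
            lines.append(f"    row = row + {arg!r}")
    lines.append("    return row")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled"]


def run_adaptive(plan, rows, morsel_size=2):
    """Start interpreting right away; switch once compilation has finished."""
    results, modes = [], []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(compile_plan, plan)  # compile in the background
        for start in range(0, len(rows), morsel_size):
            morsel = rows[start:start + morsel_size]
            if future.done():
                fn = future.result()
                modes.append("compiled")
            else:
                fn = lambda r: interpret(plan, r)
                modes.append("interpreted")
            for row in morsel:
                value = fn(row)
                if value is not None:
                    results.append(value)
    return results, modes


plan = [("filter_gt", 2), ("add", 10)]
results, modes = run_adaptive(plan, list(range(6)))
```

Both execution paths produce the same results; which morsels run interpreted and which run compiled depends on timing, so `modes` can differ between runs. In a real engine the switch hides compilation latency because useful work proceeds while code is being generated.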
