ScaleDB: A Scalable, Asynchronous In-Memory Database

ScaleDB is a serializable in-memory transactional database that scales well on multi-core machines by updating range indexes asynchronously. We find that asynchronous range index updates significantly improve database scalability: applying updates in batches reduces contention on critical sections. To avoid stale reads, ScaleDB uses small hash indexlets to hold delayed updates. We use indexlets to design ACC, an asynchronous concurrency control protocol that provides serializability. With ACC, range index updates can be delayed without adversely affecting transaction execution performance in the common case. ACC delivers scalable serializable isolation for transactions, with high throughput and a low abort rate. Evaluation on a dual-socket server with 36 cores shows that ScaleDB achieves 9.5× better query throughput than Peloton on the YCSB benchmark and 1.8× better transaction throughput than Cicada on the TPC-C benchmark.
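The idea of buffering range index updates in a hash indexlet and applying them in batches can be illustrated with a minimal single-threaded sketch. This is not ScaleDB's implementation: the class `AsyncRangeIndex`, its methods, and the `batch_size` parameter are all hypothetical names chosen for illustration, the range index is modeled as a sorted Python list, and concurrency and the ACC protocol are left out entirely.

```python
import bisect


class AsyncRangeIndex:
    """Toy model (an assumption, not ScaleDB's design): a sorted range index
    plus a hash "indexlet" that buffers updates until a batch is applied."""

    def __init__(self, batch_size=4):
        self._keys = []        # sorted keys already present in the range index
        self._values = {}      # key -> value for keys in the range index
        self._indexlet = {}    # delayed updates: key -> value (hash lookup)
        self._batch_size = batch_size

    def put(self, key, value):
        # Buffer the update instead of touching the sorted structure.
        self._indexlet[key] = value
        if len(self._indexlet) >= self._batch_size:
            self.flush()

    def get(self, key):
        # Consult the indexlet first so a delayed update is never missed,
        # which is what prevents a stale read in this model.
        if key in self._indexlet:
            return self._indexlet[key]
        return self._values.get(key)

    def scan(self, lo, hi):
        # Combine range-index entries in [lo, hi] with matching delayed ones.
        # Scanning the whole indexlet is a simplification for this sketch.
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        merged = {k: self._values[k] for k in self._keys[i:j]}
        merged.update(
            {k: v for k, v in self._indexlet.items() if lo <= k <= hi}
        )
        return sorted(merged.items())

    def flush(self):
        # Apply all delayed updates to the range index in one batch.
        for key, value in self._indexlet.items():
            if key not in self._values:
                bisect.insort(self._keys, key)
            self._values[key] = value
        self._indexlet.clear()
```

In this sketch the sorted structure is modified only inside `flush`, so a real multi-core design would need to protect only that batched step, while point reads stay correct because they check the indexlet before the range index.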
