OpLog: a library for scaling update-heavy data structures

Existing techniques (e.g., RCU) can achieve good multicore scaling for read-mostly data, but for update-heavy data structures only special-purpose techniques exist. This paper presents OpLog, a general-purpose library that supports good scalability for update-heavy data structures. OpLog achieves scalability by logging each update in a low-contention per-core log; it combines the logs only when a read of the data structure requires it. OpLog achieves generality by logging operations without having to understand them, which eases its application to existing data structures. OpLog can further increase performance if the programmer indicates which operations can be combined in the logs. An evaluation shows how to apply OpLog to three update-heavy Linux kernel data structures. Measurements on a 48-core AMD server show that the result significantly improves the performance of the Apache web server and the Exim mail server under certain workloads.
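
The per-core logging idea described above can be illustrated with a minimal Python sketch. It is not OpLog's actual API or implementation: the class `LoggedObject`, its methods `update` and `read`, and the explicit `core` argument are hypothetical names chosen for illustration. The sketch also assumes that logged operations are ordered by a timestamp, and it uses a shared counter as a stand-in for a synchronized per-core hardware clock (a real design would want to avoid such a shared counter, since it is itself a point of contention).

```python
import itertools
import threading


class LoggedObject:
    """Hypothetical wrapper: updates go to per-core logs, reads merge them."""

    def __init__(self, target, num_cores):
        self.target = target
        # One log and one lock per core, so updates on different cores
        # do not contend with each other.
        self.logs = [[] for _ in range(num_cores)]
        self.log_locks = [threading.Lock() for _ in range(num_cores)]
        # Assumption: stand-in for a synchronized hardware timestamp.
        self._clock = itertools.count()

    def update(self, core, op, *args):
        """Record op(target, *args) in this core's log; target is untouched."""
        with self.log_locks[core]:
            self.logs[core].append((next(self._clock), op, args))

    def read(self, fn):
        """Merge all logs in timestamp order, apply them, then run fn(target)."""
        for lock in self.log_locks:
            lock.acquire()
        try:
            pending = []
            for log in self.logs:
                pending.extend(log)
                log.clear()
            # The wrapper never inspects the operations; it only orders them.
            pending.sort(key=lambda entry: entry[0])
            for _, op, args in pending:
                op(self.target, *args)
            return fn(self.target)
        finally:
            for lock in reversed(self.log_locks):
                lock.release()


members = LoggedObject(set(), num_cores=4)
members.update(0, set.add, "a")      # logged on core 0
members.update(1, set.add, "b")      # logged on core 1
members.update(2, set.discard, "a")  # logged on core 2
print(members.read(sorted))          # logs are merged only here
```

Note that the wrapper treats each operation as an opaque callable, which mirrors the generality claim: the underlying set is modified only when `read` merges the logs. Combining operations within a log (for example, cancelling an add against a later discard of the same element) is omitted from this sketch.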
