Optimistic concurrency with OPTIK

We introduce OPTIK, a new practical design pattern for designing and implementing fast and scalable concurrent data structures. OPTIK relies on the commonly-used technique of version numbers for detecting conflicting concurrent operations. We show how to implement the OPTIK pattern using the novel concept of OPTIK locks. These locks enable the use of version numbers for implementing very efficient optimistic concurrent data structures. Existing state-of-the-art lock-based data structures acquire the lock and then check for conflicts. In contrast, with OPTIK locks, we merge the lock acquisition with the detection of conflicting concurrency in a single atomic step, similarly to lock-free algorithms. We illustrate the power of our OPTIK pattern and its implementation by introducing four new algorithms and by optimizing four state-of-the-art algorithms for linked lists, skip lists, hash tables, and queues. Our results show that concurrent data structures built using OPTIK are more scalable than the state of the art.

[1]  Maged M. Michael,et al.  High performance dynamic lock-free hash tables and list-based sets , 2002, SPAA '02.

[2]  Yi Zhang,et al.  A simple, fast and scalable non-blocking concurrent FIFO queue for shared memory multiprocessor systems , 2001, SPAA '01.

[3]  J. T. Robinson,et al.  On optimistic methods for concurrency control , 1979, TODS.

[4]  William Pugh,et al.  Concurrent maintenance of skip lists , 1990 .

[5]  Tudor David,et al.  Everything you always wanted to know about synchronization but were afraid to ask , 2013, SOSP.

[6]  Maurice Herlihy,et al.  A Lazy Concurrent List-Based Set Algorithm , 2007, Parallel Process. Lett..

[7]  Michael L. Scott,et al.  Software partitioning of hardware transactions , 2015, PPoPP.

[8]  Kunle Olukotun,et al.  A practical concurrent binary search tree , 2010, PPoPP '10.

[9]  Maurice Herlihy,et al.  Linearizability: a correctness condition for concurrent objects , 1990, TOPL.

[10]  Marcos K. Aguilera,et al.  Sinfonia: a new paradigm for building scalable distributed systems , 2007, SOSP.

[11]  Maurice Herlihy,et al.  Wait-free synchronization , 1991, TOPL.

[12]  Miguel Castro,et al.  FaRM: Fast Remote Memory , 2014, NSDI.

[13]  Maged M. Michael,et al.  Simple, fast, and practical non-blocking and blocking concurrent queue algorithms , 1996, PODC '96.

[14]  Changwoo Min,et al.  Scalability in the Clouds!: A Myth or Reality? , 2015, APSys.

[15]  Paul E. McKenney,et al.  READ-COPY UPDATE: USING EXECUTION HISTORY TO SOLVE CONCURRENCY PROBLEMS , 2002 .

[16]  Nir Shavit,et al.  Transactional Locking II , 2006, DISC.

[17]  Keir Fraser,et al.  Practical lock-freedom , 2003 .

[18]  Rachid Guerraoui,et al.  On the correctness of transactional memory , 2008, PPoPP.

[19]  Yehuda Afek,et al.  Fast concurrent queues for x86 processors , 2013, PPoPP '13.

[20]  Amitabha Roy,et al.  A runtime system for software lock elision , 2009, EuroSys '09.

[21]  Michael L. Scott,et al.  Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.

[22]  Tudor David,et al.  Asynchronized Concurrency: The Secret to Scaling Concurrent Search Data Structures , 2015, ASPLOS.

[23]  GramoliVincent More than you ever wanted to know about synchronization: synchrobench, measuring the impact of the synchronization on concurrent algorithms , 2015 .

[24]  Philippas Tsigas,et al.  Fast and lock-free concurrent priority queues for multi-thread systems , 2005, J. Parallel Distributed Comput..

[25]  Eran Yahav,et al.  Practical concurrent binary search trees via logical ordering , 2014, PPoPP '14.

[26]  Adam Morrison,et al.  Predicate RCU: an RCU for scalable concurrent updates , 2015, PPOPP.

[27]  Neeraj Mittal,et al.  Fast concurrent lock-free binary search trees , 2014, PPoPP.

[28]  Christoph Lameter,et al.  Effective Synchronization on Linux/NUMA Systems , 2005 .

[29]  Timothy L. Harris,et al.  A Pragmatic Implementation of Non-blocking Linked-Lists , 2001, DISC.

[30]  Nir Shavit,et al.  A scalable lock-free stack algorithm , 2004, SPAA '04.

[31]  Faith Ellen,et al.  Non-blocking binary search trees , 2010, PODC.

[32]  Petr Kuznetsov,et al.  Brief Announcement: A Concurrency-Optimal List-Based Set , 2015 .

[33]  Nir Shavit,et al.  Flat combining and the synchronization-parallelism tradeoff , 2010, SPAA '10.

[34]  Maurice Herlihy,et al.  The Art of Multiprocessor Programming, Revised Reprint , 2012 .

[35]  Jonathan Walpole,et al.  Performance of memory reclamation for lockless synchronization , 2007, J. Parallel Distributed Comput..

[36]  Torvald Riegel,et al.  Dynamic performance tuning of word-based software transactional memory , 2008, PPoPP.

[37]  Maurice Herlihy,et al.  A Simple Optimistic Skiplist Algorithm , 2007, SIROCCO.

[38]  Maged M. Michael Hazard pointers: safe memory reclamation for lock-free objects , 2004, IEEE Transactions on Parallel and Distributed Systems.

[39]  Michael F. Spear,et al.  NOrec: streamlining STM by abolishing ownership records , 2010, PPoPP '10.

[40]  Timothy L. Harris,et al.  STM in the small: trading generality for performance in software transactional memory , 2012, EuroSys '12.

[41]  Hagit Attiya,et al.  Concurrent updates with RCU: search tree as an example , 2014, PODC '14.

[42]  Nir Shavit,et al.  Software transactional memory , 1995, PODC '95.

[43]  William N. Scherer,et al.  Preemption Adaptivity in Time-Published Queue-Based Spin Locks , 2005, HiPC.

[44]  James R. Goodman,et al.  Speculative lock elision: enabling highly concurrent multithreaded execution , 2001, MICRO.

[45]  Maurice Herlihy,et al.  Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.