A highly-efficient wait-free universal construction

We present SIM, a new, simple wait-free universal construction that uses just a Fetch&Add object and an LL/SC object and performs a constant number of shared memory accesses. We have implemented SIM on a real shared-memory machine. In theory, our practical version of SIM, called P-SIM, has worse complexity than its theoretical analog; in practice, however, our experiments show that P-SIM outperforms several state-of-the-art lock-based and lock-free techniques, even though it is wait-free, i.e., it satisfies a stronger progress condition than all the algorithms it outperforms. We have used P-SIM to obtain highly efficient wait-free implementations of stacks and queues. Our experiments show that these implementations outperform the current state-of-the-art shared stack and queue implementations, which ensure only progress properties weaker than wait-freedom.
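To make the idea of a universal construction built from a Fetch&Add object and an LL/SC object more concrete, here is a minimal sketch of one possible toggle-and-help scheme. It is an illustration under stated assumptions, not the algorithm as published: the names FetchAddRegister, CasCell, State, HelpingCounter and run_demo are hypothetical, the shared object is a simple counter, compare-and-set on a fresh record stands in for LL/SC, and both primitives are emulated with locks because Python exposes no hardware atomics (so the sketch shows the logic only, not the progress guarantee or the performance).

```python
import threading

class FetchAddRegister:
    """Fetch&Add register. ASSUMPTION: emulated with a lock for illustration."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()
    def fetch_add(self, delta):
        with self._lock:
            old = self._value
            self._value += delta
            return old
    def load(self):
        with self._lock:
            return self._value

class CasCell:
    """Compare-and-set cell standing in for LL/SC. ASSUMPTION: lock-emulated."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()
    def load(self):
        with self._lock:
            return self._value
    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value is expected:  # identity check on a fresh record
                self._value = new
                return True
            return False

class State:
    """Immutable snapshot: served toggle bits, counter value, return values."""
    def __init__(self, applied, value, rvals):
        self.applied, self.value, self.rvals = applied, value, rvals

class HelpingCounter:
    """Hypothetical shared counter where every thread helps all pending requests."""
    def __init__(self, n):
        self.n = n
        self.announce = [0] * n            # announce[i]: pending argument of thread i
        self.toggles = FetchAddRegister()  # bit i flips once per request of thread i
        self.my_bit = [0] * n              # my_bit[i] is touched only by thread i
        self.state = CasCell(State(0, 0, (None,) * n))

    def fetch_and_add(self, i, delta):
        self.announce[i] = delta
        bit = 1 << i
        # Flip bit i: only thread i changes it, so adding or subtracting 2**i is safe.
        self.toggles.fetch_add(bit if self.my_bit[i] == 0 else -bit)
        self.my_bit[i] ^= 1
        for _ in range(2):  # two attempts: if both fail, a helper served the request
            old = self.state.load()
            togg = self.toggles.load()
            pending = old.applied ^ togg   # bits that differ mark unserved requests
            value, rvals = old.value, list(old.rvals)
            for j in range(self.n):
                if (pending >> j) & 1:
                    rvals[j] = value       # return value: counter before the add
                    value += self.announce[j]
            self.state.compare_and_set(old, State(togg, value, tuple(rvals)))
        return self.state.load().rvals[i]

def run_demo(n_threads, ops_per_thread):
    """Run concurrent increments; return the final value and all return values."""
    counter = HelpingCounter(n_threads)
    results = [[] for _ in range(n_threads)]
    def worker(i):
        for _ in range(ops_per_thread):
            results[i].append(counter.fetch_and_add(i, 1))
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.state.load().value, [r for per in results for r in per]
```

In this sketch each request costs one Fetch&Add and a bounded number of reads and compare-and-set attempts on the shared state, regardless of what other threads do; the local work of scanning and applying pending requests still grows with the number of threads.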
