Distributed queues in shared memory: multicore performance and scalability through quantitative relaxation

A prominent remedy to multicore scalability issues in concurrent data structure implementations is to relax the sequential specification of the data structure. We present distributed queues (DQ), a new family of relaxed concurrent queue implementations. DQs implement relaxed queues with a linearizable emptiness check and with either out-of-order behavior (configurable or bounded) or pool behavior. In our micro- and macrobenchmarks, DQs outperform and outscale all strict and relaxed queue implementations, as well as all pool implementations, that we considered.
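To make the idea of a relaxed queue concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a design in which elements are spread over several FIFO "partial queues" chosen at random, so that FIFO order holds within each partial queue but not across them. The class name `DistributedQueueSketch`, the parameter `num_partial_queues`, the per-queue locks, and the random selection policy are all hypothetical choices made for illustration. In particular, the emptiness check here is a simple scan and is not claimed to be linearizable under concurrency.

```python
import random
import threading
from collections import deque


class DistributedQueueSketch:
    """Illustrative relaxed queue built from several FIFO partial queues.

    Assumption: this layout and the random selection policy are chosen for
    illustration only; they are not taken from the abstract.
    """

    def __init__(self, num_partial_queues, seed=None):
        if num_partial_queues < 1:
            raise ValueError("need at least one partial queue")
        self._queues = [deque() for _ in range(num_partial_queues)]
        self._locks = [threading.Lock() for _ in range(num_partial_queues)]
        self._rng = random.Random(seed)

    def enqueue(self, item):
        # Pick one partial queue at random and append under its lock.
        i = self._rng.randrange(len(self._queues))
        with self._locks[i]:
            self._queues[i].append(item)

    def dequeue(self):
        # Start at a random partial queue and scan each one once.
        # Returns None if every partial queue was empty when visited;
        # this scan is NOT a linearizable emptiness check.
        n = len(self._queues)
        start = self._rng.randrange(n)
        for offset in range(n):
            i = (start + offset) % n
            with self._locks[i]:
                if self._queues[i]:
                    return self._queues[i].popleft()
        return None
```

With `num_partial_queues=1` the sketch degenerates to a strict FIFO queue. With more partial queues, elements may be returned out of insertion order, but every enqueued element is still returned exactly once.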
