Scal: A Benchmarking Suite for Concurrent Data Structures

Concurrent data structures such as queues, stacks, and pools are widely used in concurrent programming of shared-memory multiprocessor and multicore machines. The key challenge is to develop data structures that are not only fast on a given machine but whose performance scales, ideally linearly, with the number of threads, cores, and processors on even bigger machines. Part of that challenge is to provide a common ground for systematically evaluating the performance and scalability of new concurrent data structures and for comparing the results against existing solutions. For this purpose, we have developed Scal, an open-source benchmarking framework that provides (1) software infrastructure for executing concurrent data structure algorithms, (2) workloads for benchmarking their performance and scalability, and (3) implementations of a large set of concurrent data structures. We discuss the Scal infrastructure, workloads, and implementations, and encourage further use and development of Scal in the design and implementation of ever faster concurrent data structures.
