An Adaptive Technique for Constructing Robust and High-Throughput Shared Objects

Shared counters are key to solving a variety of coordination problems on multiprocessor machines, such as barrier synchronization and index distribution. Like shared objects in general, they should be robust, linearizable, and scalable. We present the first linearizable and wait-free shared counter algorithm that achieves high throughput without a priori knowledge of the system's level of asynchrony. Our algorithm can easily be adapted to other combinable objects as well, such as stacks and queues. Specifically, in an N-process execution E, our algorithm achieves throughput of Ω(N / (φ_E^2 log^2 φ_E log N)), where φ_E is E's level of asynchrony. Moreover, our algorithm tolerates any constant number of faults. If E contains a constant number of faults, our algorithm still achieves throughput of Ω(N / (φ′_E^2 log^2 φ′_E log N)), where φ′_E bounds the relative speeds of any two processes during the periods in which both participate in E and neither has failed. Our algorithm can be viewed as an adaptive version of the prior Bounded-Wait-Combining (BWC) algorithm. BWC receives as input an argument φ that is assumed to be an upper bound on φ_E, and achieves optimal throughput if φ = φ_E; however, if the given φ is lower than the actual φ_E, or much greater than it, BWC's throughput degrades significantly. Moreover, whereas BWC is only lock-free, our algorithm is more robust because it is wait-free. To achieve both high throughput and wait-freedom, we present a method that guarantees, for a common class of procedures, successful termination within bounded time regardless of shared-memory contention. This method may prove useful in its own right for other problems.
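To make the baseline concrete, the following is a minimal sketch of the standard lock-free linearizable counter that work in this line improves on: a fetch-and-increment built from a compare-and-swap retry loop. This is not the paper's algorithm (which uses combining to avoid a single contention hotspot); the class and method names are illustrative only. It shows both why the simple construction is linearizable and why it is only lock-free, not wait-free: a process can fail its CAS an unbounded number of times under contention.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: a linearizable shared counter built on compare-and-swap.
// Linearizable and lock-free, but NOT wait-free: a process may lose the CAS
// race indefinitely, and all N processes contend on one memory word, which is
// the throughput bottleneck that combining-based counters are designed to avoid.
class CasCounter {
    private final AtomicLong value = new AtomicLong(0);

    // fetch-and-increment: returns the value before the increment
    long fetchAndIncrement() {
        while (true) {
            long v = value.get();
            if (value.compareAndSet(v, v + 1)) {
                return v; // CAS succeeded: our increment took effect atomically
            }
            // CAS failed: another process updated the counter first; retry
        }
    }

    long read() {
        return value.get();
    }
}

public class Demo {
    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    counter.fetchAndIncrement();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // Every increment is applied exactly once: 4 threads x 10,000 each.
        System.out.println(counter.read()); // prints 40000
    }
}
```

Because the CAS loop serializes every operation through a single word, throughput degrades as N grows; combining-based designs such as BWC and the adaptive algorithm described above merge concurrent increments in a tree so that many operations are applied with one access to the shared counter.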

[1]  Mark Moir, et al. Using elimination to implement scalable and lock-free FIFO queues, 2005, SPAA '05.

[2]  Nancy A. Lynch, et al. Consensus in the presence of partial synchrony, 1988, JACM.

[3]  Mark Moir, et al. SNZI: scalable NonZero indicators, 2007, PODC '07.

[4]  Danny Dolev, et al. On the minimal synchronism needed for distributed consensus, 1983, 24th Annual Symposium on Foundations of Computer Science (FOCS 1983).

[5]  David R. Cheriton, et al. Leases: an efficient fault-tolerant mechanism for distributed file cache consistency, 1989, SOSP '89.

[6]  Maurice Herlihy, et al. Counting networks, 1994, JACM.

[7]  Nir Shavit, et al. A scalable lock-free stack algorithm, 2010, J. Parallel Distributed Comput.

[8]  Marina Papatriantafilou, et al. Self-tuning Reactive Distributed Trees for Counting and Balancing, 2004, OPODIS.

[9]  Maurice Herlihy, et al. Linearizable counting networks, 1996, Distributed Computing.

[10]  Maurice Herlihy, et al. Contention in shared memory algorithms, 1997, J. ACM.

[11]  Mary K. Vernon, et al. Efficient synchronization primitives for large-scale cache-coherent multiprocessors, 1989, ASPLOS III.

[12]  Tushar Deepak Chandra, et al. A polylog time wait-free construction for closed objects, 1998, PODC '98.

[13]  Maurice Herlihy, et al. Linearizability: a correctness condition for concurrent objects, 1990, TOPLAS.

[14]  Nir Shavit, et al. Diffracting trees, 1996, TOCS.

[15]  Maurice Herlihy, et al. Low contention linearizable counting, 1991, 32nd Annual Symposium on Foundations of Computer Science (FOCS 1991).

[17]  Ralph Grishman, et al. The NYU ultracomputer—designing a MIMD, shared-memory parallel machine, 1998, ISCA '98.

[18]  Danny Hendler, et al. Constructing Shared Objects That Are Both Robust and High-Throughput, 2006, DISC.

[19]  Michael B. Greenwald, et al. Two-handed emulation: how to build non-blocking implementations of complex data-structures using DCAS, 2002, PODC '02.

[20]  Maurice Herlihy, et al. Wait-free synchronization, 1991, TOPLAS.