Efficiently Implementing a Large Number of LL/SC Objects

Over the past decade, a pair of instructions called load-linked (LL) and store-conditional (SC) has emerged as the most suitable synchronization primitive for the design of lock-free algorithms. However, current architectures do not support these instructions; instead, they support either CAS (e.g., UltraSPARC, Itanium) or restricted versions of LL/SC, known as RLL/RSC (e.g., POWER4, MIPS, Alpha). Thus, there is a gap between what algorithm designers want (namely, LL/SC) and what multiprocessors actually support (namely, CAS or RLL/RSC). To bridge this gap, a flurry of algorithms that implement LL/SC from CAS have appeared in the literature. The two most recent algorithms are due to Doherty, Herlihy, Luchangco, and Moir (2004) and Michael (2004). To implement M LL/SC objects shared by N processes, Doherty et al.'s algorithm uses only O(N + M) space, but is only non-blocking and not wait-free. Michael's algorithm, on the other hand, is wait-free, but uses O(N^2 + M) space. The main drawback of his algorithm is the time complexity of the SC operation: although the expected amortized running time of SC is only O(1), the worst-case running time of SC is O(N^2). The algorithm in this paper overcomes this drawback. Specifically, we design a wait-free algorithm that achieves a space complexity of O(N^2 + M), while still maintaining the O(1) worst-case running time for LL and SC operations.
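To make the LL/SC semantics concrete, the following is a minimal sketch of the textbook way to emulate LL/SC on top of CAS by pairing each value with a version tag. It is not the algorithm of this paper, nor that of Doherty et al. or Michael. The class names `CASWord` and `LLSCObject` are hypothetical, the CAS word is simulated with a lock because Python has no hardware CAS, and the tag is an unbounded integer, which a real fixed-width CAS word could not provide.

```python
import threading


class CASWord:
    """Simulated CAS word. The lock only stands in for hardware atomicity
    (assumption: Python exposes no native compare-and-swap)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        # Atomically replace the content only if it still equals `expected`.
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False


class LLSCObject:
    """Illustrative LL/SC object built from one CAS word holding (value, tag)."""

    def __init__(self, initial):
        self._word = CASWord((initial, 0))
        # Each thread remembers the (value, tag) pair seen by its latest LL.
        self._linked = threading.local()

    def ll(self):
        snapshot = self._word.load()
        self._linked.snapshot = snapshot
        return snapshot[0]

    def sc(self, new_value):
        snapshot = getattr(self._linked, "snapshot", None)
        if snapshot is None:
            return False  # SC without a preceding LL by this thread
        value, tag = snapshot
        # Every successful SC bumps the tag, so this CAS fails if any
        # successful SC has occurred since this thread's latest LL.
        return self._word.compare_and_swap(snapshot, (new_value, tag + 1))
```

In this sketch both LL and SC take a constant number of steps, but the approach depends on the tag never repeating. The algorithms discussed above exist precisely because bounded machine words make that assumption unsafe.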

[1] Richard L. Sites, et al. Alpha Architecture Reference Manual, 1995.

[2] Maurice Herlihy, et al. Bringing practical lock-free synchronization to 64-bit applications, 2004, PODC '04.

[3] Mark Moir. Practical implementations of non-blocking synchronization primitives, 1997, PODC '97.

[4] Maurice Herlihy, et al. A methodology for implementing highly concurrent data structures, 1990, PPOPP '90.

[5] Prasad Jayanti. An optimal multi-writer snapshot algorithm, 2005, STOC '05.

[6] Prasad Jayanti, et al. Efficient Wait-Free Implementation of Multiword LL/SC Variables, 2005, 25th IEEE International Conference on Distributed Computing Systems (ICDCS'05).

[7] Maurice Herlihy, et al. Obstruction-free synchronization: double-ended queues as an example, 2003, 23rd International Conference on Distributed Computing Systems, 2003. Proceedings.

[8] Gary L. Peterson. A New Solution to Lamport's Concurrent Programming Problem Using Small Shared Variables, 1983, TOPL.

[9] Maurice Herlihy, et al. Wait-free synchronization, 1991, TOPL.

[10] Amos Israeli, et al. Disjoint-access-parallel implementations of strong shared memory primitives, 1994, PODC '94.

[11] Mark Moir, et al. Universal Constructions for Large Objects, 1995, IEEE Trans. Parallel Distributed Syst.

[12] Maurice Herlihy, et al. A methodology for implementing highly concurrent data objects, 1993, TOPL.

[13] Mark Moir, et al. Nonblocking k-compare-single-swap, 2003, SPAA '03.

[14] Prasad Jayanti, et al. Efficient and practical constructions of LL/SC variables, 2003, PODC '03.

[15] Maurice Herlihy, et al. Linearizability: a correctness condition for concurrent objects, 1990, TOPL.

[16] Gary L. Peterson, et al. Concurrent Reading While Writing, 1983, TOPL.

[17] Prasad Jayanti. f-arrays: implementation and applications, 2002, PODC '02.

[18] Tushar Deepak Chandra, et al. A polylog time wait-free construction for closed objects, 1998, PODC '98.

[19] Leslie Lamport, et al. Concurrent reading and writing, 1977, Commun. ACM.

[20] Maged M. Michael. Practical Lock-Free and Wait-Free LL/SC/VL Implementations Using 64-Bit CAS, 2004, DISC.

[21] Mark Moir, et al. Universal constructions for multi-object operations, 1995, PODC '95.