Efficient and practical constructions of LL/SC variables

Over the past decade, a pair of synchronization instructions known as LL/SC has emerged as the most suitable set of instructions to be used in the design of lock-free algorithms. However, no existing multiprocessor system supports these instructions in hardware. Instead, most modern multiprocessors support instructions such as CAS or RLL/RSC (e.g., POWER4, MIPS, SPARC, IA-64). This paper presents two efficient algorithms that implement 64-bit LL/SC from 64-bit CAS or RLL/RSC. Our results are summarized as follows.

We present a practical algorithm for implementing a 64-bit LL/SC object from 64-bit CAS or RLL/RSC objects. Our result shows, for the first time, a practical way of simulating a 64-bit LL/SC memory word using 64-bit CAS memory words (or 64-bit RLL/RSC memory words), incurring only a small constant space overhead per process and a small constant factor slowdown.

Although our first solution performs correctly in any practical system, its theoretical correctness depends on unbounded sequence numbers. We present a bounded algorithm that implements a 64-bit LL/SC object from 64-bit CAS or RLL/RSC objects, and has the same time and space complexities as the first algorithm.

This and the previous algorithm improve on existing implementations of LL/SC objects by Anderson and Moir in 1995, and Moir in 1997.
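To make the problem concrete, the following is a minimal sketch of the classic sequence-number ("tag") simulation of LL/SC on top of a single CAS word. It is not either of the algorithms presented in this paper: it packs a sequence number and the value into one 64-bit word, so it provides only a 32-bit LL/SC value, which is the kind of limitation the abstract's 64-bit-from-64-bit result avoids. The class name TaggedLLSC, the method names ll, vl and sc, and the bit layout are all assumptions made for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical illustration (not the paper's algorithm): a 32-bit LL/SC
// variable stored in one 64-bit CAS word.
// Assumed layout: high 32 bits = sequence number, low 32 bits = value.
class TaggedLLSC {
public:
    explicit TaggedLLSC(std::uint32_t initial) : word_(pack(0, initial)) {}

    // LL: read the whole word, save it in the caller's context, and
    // return the value part (the low 32 bits).
    std::uint32_t ll(std::uint64_t& ctx) const {
        ctx = word_.load(std::memory_order_acquire);
        return static_cast<std::uint32_t>(ctx);
    }

    // VL: true if the word is unchanged since the LL that produced ctx.
    bool vl(std::uint64_t ctx) const {
        return word_.load(std::memory_order_acquire) == ctx;
    }

    // SC: install `value` only if the word is unchanged since the LL that
    // produced ctx. Every successful SC increments the sequence number, so
    // a value that changes and then changes back is still detected.
    // Caveat: the 32-bit sequence number wraps around, so this is only
    // correct as long as it does not wrap between an LL and its SC.
    bool sc(std::uint64_t ctx, std::uint32_t value) {
        std::uint32_t seq = static_cast<std::uint32_t>(ctx >> 32);
        std::uint64_t desired = pack(seq + 1, value);
        return word_.compare_exchange_strong(ctx, desired,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire);
    }

private:
    static std::uint64_t pack(std::uint32_t seq, std::uint32_t value) {
        return (static_cast<std::uint64_t>(seq) << 32) | value;
    }

    std::atomic<std::uint64_t> word_;
};
```

In this sketch, a process that performs ll, then observes other processes change the value from 5 to 7 and back to 5, will still fail its sc, because the sequence number has advanced. The cost is that half of the 64-bit word is spent on the sequence number, and correctness still relies on that number not wrapping around.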

[1]  Mark Moir Laziness pays! Using lazy synchronization mechanisms to improve non-blocking constructions , 2001, Distributed Computing.

[2]  Mark Moir,et al.  Universal Constructions for Large Objects , 1995, IEEE Trans. Parallel Distributed Syst.

[3]  Richard L. Sites,et al.  Alpha Architecture Reference Manual , 1995 .

[4]  Mark Moir,et al.  Transparent Support for Wait-Free Transactions , 1997, WDAG.

[5]  John D. Valois Implementing Lock-Free Queues , 1994 .

[6]  David L. Weaver,et al.  The SPARC Architecture Manual , 2003 .

[7]  David L Weaver,et al.  The SPARC architecture manual : version 9 , 1994 .

[8]  Mark Moir,et al.  Universal constructions for multi-object operations , 1995, PODC '95.

[9]  Amos Israeli,et al.  Disjoint-access-parallel implementations of strong shared memory primitives , 1994, PODC '94.

[10]  Prasad Jayanti f-arrays: implementation and applications , 2002, PODC '02.

[11]  Theodore Johnson,et al.  A Nonblocking Algorithm for Shared Queues Using Compare-and-Swap , 1994, IEEE Trans. Computers.

[12]  Tushar Deepak Chandra,et al.  A polylog time wait-free construction for closed objects , 1998, PODC '98.

[13]  Maged M. Michael,et al.  Simple, fast, and practical non-blocking and blocking concurrent queue algorithms , 1996, PODC '96.

[14]  Prasad Jayanti A Complete and Constant Time Wait-Free Implementation of CAS from LL/SC and Vice Versa , 1998, DISC.

[15]  John D. Valois Lock-free linked lists using compare-and-swap , 1995, PODC '95.

[16]  Yehuda Afek,et al.  Wait-free made fast , 1995, STOC '95.

[17]  Maurice Herlihy,et al.  A methodology for implementing highly concurrent data objects , 1993, TOPL.

[18]  Mark Moir,et al.  Nonblocking k-compare-single-swap , 2003, SPAA '03.

[19]  Nir Shavit,et al.  Software transactional memory , 1995, PODC '95.

[20]  Yi Zhang,et al.  A simple, fast and scalable non-blocking concurrent FIFO queue for shared memory multiprocessor systems , 2001, SPAA '01.

[21]  Greg Barnes,et al.  A method for implementing lock-free shared-data structures , 1993, SPAA '93.

[22]  Mark Moir Practical implementations of non-blocking synchronization primitives , 1997, PODC '97.

[23]  Maurice Herlihy,et al.  A methodology for implementing highly concurrent data structures , 1990, PPOPP '90.

[24]  Prasad Jayanti,et al.  Efficient and Practical Constructions of LL/SC Variables , 2003, Dartmouth Computer Science Technical Report TR2003-446.

[25]  Prasad Jayanti,et al.  Adaptive and efficient abortable mutual exclusion , 2003, PODC '03.