Transactional Memory: Architectural Support For Lock-free Data Structures

A shared data structure is lock-free if its operations do not require mutual exclusion. If one process is interrupted in the middle of an operation, other processes will not be prevented from operating on that object. In highly concurrent systems, lock-free data structures avoid common problems associated with conventional locking techniques, including priority inversion, convoying, and difficulty of avoiding deadlock. This paper introduces transactional memory, a new multiprocessor architecture intended to make lock-free synchronization as efficient (and easy to use) as conventional techniques based on mutual exclusion. Transactional memory allows programmers to define customized read-modify-write operations that apply to multiple, independently-chosen words of memory. It is implemented by straightforward extensions to any multiprocessor cache-coherence protocol. Simulation results show that transactional memory matches or outperforms the best known locking techniques for simple benchmarks, even in the absence of priority inversion, convoying, and deadlock.

[1]  Robert Metcalfe,et al.  Ethernet: distributed packet switching for local computer networks , 1988, CACM.

[2]  Jim Gray,et al.  Notes on Data Base Operating Systems , 1978, Advanced Course: Operating Systems.

[3]  Leslie Lamport,et al.  How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs , 2016, IEEE Transactions on Computers.

[4]  J. Goodman Using cache memory to reduce processor-memory traffic , 1983, ISCA '83.

[5]  Larry Rudolph,et al.  Dynamic decentralized cache schemes for mimd parallel processors , 1984, ISCA '84.

[6]  Larry Rudolph,et al.  Dynamic decentralized cache schemes for mimd parallel processors , 1984, ISCA 1984.

[7]  Thomas F. Knight An architecture for mostly functional languages , 1986, LFP '86.

[8]  Proceedings of the 16th annual international symposium on Computer architecture , 1986 .

[9]  Gerry Kane,et al.  MIPS RISC Architecture , 1987 .

[10]  Albert Chang,et al.  801 storage: architecture and programming , 1988, TOCS.

[11]  Mary K. Vernon,et al.  Efficient synchronization primitives for large-scale cache-coherent multiprocessors , 1989, ASPLOS 1989.

[12]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and pre , 1990, ISCA 1990.

[13]  Shreekant S. Thakkar,et al.  Synchronization algorithms for shared-memory multiprocessors , 1990, Computer.

[14]  Michel Dubois,et al.  Memory Access Dependencies in Shared-Memory Multiprocessors , 1990, IEEE Trans. Software Eng..

[15]  Thomas E. Anderson,et al.  The Performance of Spin Lock Alternatives for Shared-Memory Multiprocessors , 1990, IEEE Trans. Parallel Distributed Syst..

[16]  Jeannette M. Wing,et al.  A Library of Concurrent Objects and Their Proofs of Correctness , 1990 .

[17]  Anoop Gupta,et al.  Memory consistency and event ordering in scalable shared-memory multiprocessors , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[18]  Maurice Herlihy,et al.  A methodology for implementing highly concurrent data structures , 1990, PPOPP '90.

[19]  Norman P. Jouppi,et al.  Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers , 1990, [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture.

[20]  David Chaiken,et al.  CACHE COHERENCE PROTOCOLS FOR LARGE-SCALE MULTIPROCESSORS , 1990 .

[21]  Michael L. Scott,et al.  Algorithms for scalable synchronization on shared-memory multiprocessors , 1991, TOCS.

[22]  Anant Agarwal,et al.  LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.

[23]  Anant Agarwal,et al.  LimitLESS directories: A scalable cache coherence scheme , 1991, ASPLOS IV.

[24]  Kourosh Gharachorloo,et al.  Detecting violations of sequential consistency , 1991, SPAA '91.

[25]  Anoop Gupta,et al.  Performance evaluation of memory consistency models for shared-memory multiprocessors , 1991, ASPLOS IV.

[26]  Donald Yeung,et al.  THE MIT ALEWIFE MACHINE: A LARGE-SCALE DISTRIBUTED-MEMORY MULTIPROCESSOR , 1991 .

[27]  James R. Goodman,et al.  Cache Consistency and Sequential Consistency , 1991 .

[28]  B. Bershad Practical considerations for lock-free concurrent objects , 1991 .

[29]  Calton Pu,et al.  A Lock-Free Multiprocessor OS Kernel , 1992, OPSR.

[30]  Eric A. Brewer,et al.  PROTEUS: a high-performance parallel-architecture simulator , 1992, SIGMETRICS '92/PERFORMANCE '92.

[31]  Gurindar S. Sohi,et al.  The expandable split window paradigm for exploiting fine-grain parallelsim , 1992, ISCA '92.

[32]  PROTEUS: A High-Performance Parallel-Architecture Simulator , 1992, SIGMETRICS.

[33]  Gurindar S. Sohi,et al.  The Expandable Split Window Paradigm for Exploiting Fine-grain Parallelism , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.

[34]  Edward W. Felten,et al.  Performance issues in non-blocking synchronization on shared-memory multiprocessors , 1992, PODC '92.

[35]  Jeannette M. Wing,et al.  Testing and Verifying Concurrent Objects , 1993, J. Parallel Distributed Comput..

[36]  Maurice Herlihy,et al.  Transactional Memory: Architectural Support For Lock-free Data Structures , 1993, Proceedings of the 20th Annual International Symposium on Computer Architecture.

[37]  Vasco Thudichum Vasconcelos,et al.  Typed Concurrent Objects , 1994, ECOOP.

[38]  Richard L. Sites,et al.  Alpha Architecture Reference Manual , 1995 .