Reactive synchronization algorithms for multiprocessors

Synchronization algorithms that are efficient across a wide range of applications and operating conditions are hard to design because their performance depends on unpredictable run-time factors. The designer of a synchronization algorithm has a choice of protocols to use for implementing the synchronization operation. For example, candidate protocols for locks include test-and-set protocols and queueing protocols. Frequently, the best choice of protocols depends on the level of contention: previous research has shown that test-and-set protocols for locks outperform queueing protocols at low contention, while the opposite is true at high contention. This paper investigates reactive synchronization algorithms that dynamically choose protocols in response to the level of contention. We describe reactive algorithms for spin locks and fetch-and-op that choose among several shared-memory and message-passing protocols. Dynamically choosing protocols presents a challenge: a reactive algorithm needs to select and change protocols efficiently, and has to allow for the possibility that multiple processes may be executing different protocols at the same time. We describe the notion of consensus objects that the reactive algorithms use to preserve correctness in the face of dynamic protocol changes. Experimental measurements demonstrate that reactive algorithms perform close to the best static choice of protocols at all levels of contention. Furthermore, with mixed levels of contention, reactive algorithms outperform passive algorithms with fixed protocols, provided that contention levels do not change too frequently. Measurements of several parallel applications show that reactive algorithms result in modest performance gains for spin locks and significant gains for fetch-and-op.
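As a rough illustration of what "dynamically choosing protocols in response to the level of contention" might involve, the sketch below shows one possible selection policy. It is not taken from the paper. The class `ProtocolSelector`, its parameters `retry_limit` and `switch_streak`, and the two contention signals (failed test-and-set attempts, waiters observed in the queue) are all assumptions made for this example. The sketch covers only the decision of when to switch. It does not implement the locks themselves, nor the consensus objects that keep concurrent protocol changes correct.

```python
from dataclasses import dataclass

TAS = "test-and-set"
QUEUE = "queue"


@dataclass
class ProtocolSelector:
    """Hypothetical contention monitor with hysteresis (illustrative only)."""

    retry_limit: int = 4     # assumed: more failed attempts than this means "contended"
    switch_streak: int = 3   # assumed: consecutive signals needed before switching
    protocol: str = TAS      # start with the protocol that is cheap at low contention
    _streak: int = 0         # consecutive acquisitions suggesting the other protocol

    def record_acquisition(self, failed_attempts: int, waiters_seen: int) -> str:
        """Update state after one lock acquisition; return the protocol to use next."""
        if self.protocol == TAS:
            # Many failed test-and-set attempts suggest high contention.
            mismatch = failed_attempts > self.retry_limit
        else:
            # An empty queue at acquisition time suggests low contention.
            mismatch = waiters_seen == 0

        # Any acquisition that fits the current protocol resets the streak.
        self._streak = self._streak + 1 if mismatch else 0

        if self._streak >= self.switch_streak:
            self.protocol = QUEUE if self.protocol == TAS else TAS
            self._streak = 0
        return self.protocol


if __name__ == "__main__":
    selector = ProtocolSelector()
    # Three heavily contended acquisitions in a row trigger a switch.
    for _ in range(3):
        current = selector.record_acquisition(failed_attempts=10, waiters_seen=0)
    print(current)
```

Requiring a streak of consistent signals before switching is one way to avoid paying the cost of a protocol change on every brief fluctuation, which fits the abstract's caveat that reactive algorithms do well provided contention levels do not change too frequently.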
