Diffracting trees

Shared counters are among the most basic coordination structures in multiprocessor computation, with applications ranging from barrier synchronization to concurrent-data-structure design. This article introduces diffracting trees, novel data structures for shared counting and load balancing in a distributed/parallel environment. Empirical evidence, collected on a simulated distributed shared-memory machine and several simulated message-passing architectures, shows that diffracting trees scale better and are more robust than both combining trees and counting networks, currently the most effective known methods for implementing concurrent counters in software. The use of a randomized coordination method together with a combinatorial data structure overcomes the resiliency drawbacks of combining trees. Our simulations show that to handle the same load, diffracting trees and counting networks should have a similar width w, yet the depth of a diffracting tree is O(log w), whereas counting networks have depth O(log² w). Diffracting trees have already been used to implement highly efficient producer/consumer queues, and we believe diffraction will prove to be an effective alternative paradigm to combining and queue-locking in the design of many concurrent data structures.
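The abstract does not spell out the construction, so the following is only a minimal sequential sketch of the underlying idea, under stated assumptions: a binary tree of toggle-bit balancers of depth log2(w) routes each token to one of w leaf counters, and leaf i hands out the values i, i + w, i + 2w, and so on. The names `Balancer`, `CountingTree`, and `get_and_increment` are hypothetical, and the randomized "diffraction" step (pairs of concurrent tokens leaving a balancer on opposite wires without touching its toggle bit) is deliberately omitted because the sketch is single-threaded.

```python
class Balancer:
    """Toggle-bit balancer: successive tokens alternate between outputs 0 and 1."""

    def __init__(self):
        self.toggle = 0

    def route(self):
        out = self.toggle
        self.toggle ^= 1  # flip so the next token takes the other output
        return out


class CountingTree:
    """Sequential sketch of a width-w counting tree (no diffraction, no concurrency)."""

    def __init__(self, width):
        if width < 1 or width & (width - 1):
            raise ValueError("width must be a power of two")
        self.width = width
        self.depth = width.bit_length() - 1  # log2(width) levels of balancers
        self.balancers = {}                  # created lazily, keyed by path from the root
        self.leaf_counts = [0] * width       # tokens seen so far at each leaf

    def get_and_increment(self):
        leaf, path = 0, ()
        for level in range(self.depth):
            balancer = self.balancers.setdefault(path, Balancer())
            bit = balancer.route()
            leaf |= bit << level             # the root decides the lowest-order bit
            path += (bit,)
        # Leaf i hands out i, i + w, i + 2w, ...
        value = leaf + self.width * self.leaf_counts[leaf]
        self.leaf_counts[leaf] += 1
        return value


if __name__ == "__main__":
    tree = CountingTree(8)
    print(tree.depth)
    print([tree.get_and_increment() for _ in range(10)])
```

In this sketch each token passes through exactly log2(w) balancers before reaching a leaf, which is the O(log w) depth the abstract refers to. In a concurrent setting the toggle bits near the root would become contention points, which is the problem the diffraction mechanism is meant to relieve.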
