Diffracting Trees

Shared counters are among the most basic coordination structures in multiprocessor computation, with applications ranging from barrier synchronization to concurrent data structure design. This paper introduces diffracting trees, novel distributed-parallel structures for shared counting and load balancing. Diffracting trees combine a randomized coordination method with a combinatorial data structure to yield a logarithmic-depth counter that improves on the log² depth of counting networks and overcomes the resiliency drawbacks of combining trees. Empirical evidence, collected on a simulated distributed shared-memory machine and several simulated message-passing architectures, shows that diffracting trees scale better and are more robust than both combining trees and counting networks, currently the most effective known methods for implementing concurrent counters. Diffracting trees have already been used to implement highly efficient producer/consumer queues, and we believe diffraction will prove to be an effective alternative paradigm to combining and queue-locking in the design of many concurrent data structures.
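The combinatorial half of the idea can be illustrated with a minimal, single-threaded sketch. It assumes the usual construction of a counting tree: a binary tree of one-bit balancers that route successive tokens alternately left and right, with a local counter at each leaf. The names `CountingTree` and `get_and_increment` are hypothetical and chosen only for illustration. The randomized diffraction step, in which pairs of colliding tokens are sent in opposite directions without touching the toggle bit, and all concurrency control are deliberately left out.

```python
class CountingTree:
    """Sequential sketch of a balancer tree used as a shared counter.

    Assumption: each balancer is a single toggle bit. The diffraction
    (token pairing) and any locking or atomic operations are omitted.
    """

    def __init__(self, depth):
        self.depth = depth
        self.width = 1 << depth  # number of leaves
        # toggles[level][node] is the toggle bit of one balancer.
        self.toggles = [[0] * (1 << level) for level in range(depth)]
        # Leaf i hands out the values i, i + width, i + 2 * width, ...
        self.next_value = list(range(self.width))

    def get_and_increment(self):
        leaf = 0  # bits chosen so far; also indexes the balancer in its level
        for level in range(self.depth):
            bit = self.toggles[level][leaf]
            self.toggles[level][leaf] ^= 1  # flip so the next token goes the other way
            leaf |= bit << level  # the root decides the least significant bit
        value = self.next_value[leaf]
        self.next_value[leaf] += self.width
        return value


if __name__ == "__main__":
    tree = CountingTree(depth=3)
    print([tree.get_and_increment() for _ in range(10)])
```

Each token touches one balancer per level, so a tree with w leaves costs log w steps per operation, which is the logarithmic depth the abstract refers to. In a concurrent setting the root toggle would become a hot spot, and that is the contention diffraction is meant to relieve.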
