Counting networks and multi-processor coordination

* Carnegie Mellon University. † Digital Equipment Corporation, Cambridge Research Lab. ‡ MIT Lab. for Computer Science. Supported by ONR contract N00014-91-J-1046, NSF grant CCR-8915206, DARPA contract N00014-89-J-1988, and by a Rothschild postdoctoral fellowship. A large part of this work was performed while the author was at IBM’s Almaden Research Center.

Many fundamental multi-processor coordination problems can be expressed as counting problems: processes must cooperate to assign successive values from a given range, such as addresses in memory or destinations on an interconnection network. Conventional solutions to these problems perform poorly because of synchronization bottlenecks and high memory contention. Motivated by observations on the behavior of sorting networks, we offer a completely new approach to solving such problems. We introduce a new class of networks called counting networks, i.e., networks that can be used to count. We give a counting network construction of depth log² n using n log² n “gates.” Based on this construction, we provide coordination algorithms that avoid the sequential bottlenecks inherent to earlier solutions and have substantially lower contention. Finally, to show that counting networks are not merely mathematical creatures, we provide experimental evidence that they outperform conventional synchronization techniques under a variety of circumstances.
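To make the idea of "a network that counts" concrete, the following is a minimal sequential sketch in Python. It assumes the bitonic construction as it is commonly presented for counting networks (two half-width networks feeding a recursive merger built from two-input, two-output toggles); the abstract does not spell out the construction, so this is an illustration under that assumption rather than the authors' code. The names `Balancer`, `Merger`, `Bitonic`, and `NetworkCounter` are hypothetical, the width is assumed to be a power of two, and tokens are pushed through one at a time, so the sketch says nothing about concurrent performance.

```python
import random


class Balancer:
    """Two-output toggle: successive tokens leave on output 0, 1, 0, 1, ..."""

    def __init__(self):
        self.toggle = 0

    def traverse(self):
        out = self.toggle
        self.toggle ^= 1
        return out


class Merger:
    """Merges two half-width inputs that each have the step property."""

    def __init__(self, width):
        self.width = width
        self.layer = [Balancer() for _ in range(width // 2)]
        self.half = [Merger(width // 2), Merger(width // 2)] if width > 2 else None

    def traverse(self, wire):
        out = 0
        if self.width > 2:
            parity = wire % 2
            # Even wires of the first half and odd wires of the second half
            # go to sub-merger 0; the remaining wires go to sub-merger 1.
            sub = parity if wire < self.width // 2 else 1 - parity
            out = self.half[sub].traverse(wire // 2)
        # Final layer: balancer `out` feeds output wires 2*out and 2*out + 1.
        return 2 * out + self.layer[out].traverse()


class Bitonic:
    """Counting network of the given width (assumed to be a power of two)."""

    def __init__(self, width):
        self.width = width
        self.merger = Merger(width)
        self.half = [Bitonic(width // 2), Bitonic(width // 2)] if width > 2 else None

    def traverse(self, wire):
        half_width = self.width // 2
        subnet = wire // half_width
        out = 0
        if self.width > 2:
            out = self.half[subnet].traverse(wire % half_width)
        return self.merger.traverse(subnet * half_width + out)


class NetworkCounter:
    """Output wire i hands out the values i, i + width, i + 2*width, ..."""

    def __init__(self, width):
        self.width = width
        self.network = Bitonic(width)
        self.next_value = list(range(width))

    def get_and_increment(self, wire):
        exit_wire = self.network.traverse(wire)
        value = self.next_value[exit_wire]
        self.next_value[exit_wire] += self.width
        return value


counter = NetworkCounter(8)
rng = random.Random(0)  # tokens enter on arbitrary input wires
values = [counter.get_and_increment(rng.randrange(8)) for _ in range(20)]
print(values)
```

In this one-token-at-a-time run, each token exits on the next output wire in round-robin order whichever input wire it entered on, so the values come out as successive integers. The point of the design is that no single shared location is touched by every token: each token visits only the few toggles on its own path and one per-wire counter at the end.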
