Contention in counting networks

* Department of Computer Science, University of Crete, and Institute of Computer Science, FORTH, Heraklion 71110, Greece. Email address: Bpous@csi.forth.gr
† Department of Computer Science, University of Crete, and Institute of Computer Science, FORTH, Heraklion 71110, Greece. Email address: hardavacsi forth .gr
‡ Department of Computer Science, University of Cyprus, Nicosia, Cyprus. Currently at Institute of Computer Science, Foundation of Research and Technology, Greece. Email address: navronic@csi.forth.gr

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association of Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
PODC 94, 8/94, Los Angeles, CA, USA. © 1994 ACM 0-89791-654-9/94/0008 $3.50

Implementing counting networks [2] on shared-memory multiprocessor machines often incurs a performance penalty proportional to the extent to which concurrent processors simultaneously access the same memory location. In this work, we continue the study, initiated in [7], of the dependence of performance, as measured by contention [5], on the width of the balancers used in such constructions. Our main results are two new constructions of counting networks, of widths $p2^k$ and $pq^k$, respectively, for any integers $p, q > 2$ and $k \geq 0$, together with corresponding formal contention analyses.

The first construction is an elegant generalization of the classical bitonic network of Batcher [3] to widths of $p2^k$, using 2- and $p$-balancers. This construction significantly improves, in both size and depth, on a previous attempt [1] at constructing counting networks of this width. We provide a sharp contention analysis, based on recurrence relations that exploit the recursive structure of the construction. This analysis establishes a tight asymptotic bound with dominant term $nk^2/p2^{k-1}$ in the presence of $n$ concurrent processors, and demonstrates that increasing the width of the bitonic counting network [2] by a constant factor $p$ results in a decrease in contention by the same factor. This implies an interesting trade-off between the amount of hardware and the efficiency of software for implementing bitonic counting networks on multiprocessor architectures.

The second construction uses $p$-balancers and $q$-balancers and achieves width $pq^k$. It is based on constructing a smoothing network of width $pq^k$ and cascading it with any sorting network. This construction generalizes one of width $p2^k$ presented in [1]. We establish corresponding upper bounds on contention for this construction. Both correctness proofs for these constructions are modular and systematic, and consist merely of verifying sufficient conditions for counting networks shown in [4]. We are currently implementing a software simulation of our constructions on a general asynchronous multiprocessor machine. We have obtained some initial experimental evidence that, under a variety of circumstances, our constructions outperform previous ones presented in [2, 6] in terms of typical contention.
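As a rough illustration of the building block these constructions rely on, the sketch below models a single balancer of arbitrary width as a round-robin toggle and checks the step property on its output counts. It is a sequential toy model only, not the authors' implementation: the names `Balancer`, `traverse`, and `has_step_property` are hypothetical, and the sketch ignores concurrency entirely. On a shared-memory machine the toggle would be one shared location that all processors passing through the balancer update, which is where contention arises.

```python
from typing import List


class Balancer:
    """Toy p-balancer (hypothetical name): the i-th token leaves on wire i mod p."""

    def __init__(self, width: int) -> None:
        if width < 2:
            raise ValueError("a balancer needs at least two output wires")
        self.width = width
        self.toggle = 0              # next output wire to use
        self.outputs = [0] * width   # tokens emitted so far on each output wire

    def traverse(self) -> int:
        """Route one token and return the index of the wire it leaves on."""
        wire = self.toggle
        self.toggle = (self.toggle + 1) % self.width
        self.outputs[wire] += 1
        return wire


def has_step_property(counts: List[int]) -> bool:
    """Return True if 0 <= counts[i] - counts[j] <= 1 for every pair i < j."""
    return all(
        0 <= counts[i] - counts[j] <= 1
        for i in range(len(counts))
        for j in range(i + 1, len(counts))
    )


# Push 7 tokens through a 3-balancer and inspect the output counts.
balancer = Balancer(width=3)
for _ in range(7):
    balancer.traverse()
print(balancer.outputs, has_step_property(balancer.outputs))  # [3, 2, 2] True
```

A counting network wires many such balancers together so that the step property holds on the outputs of the whole network. Wider balancers mean fewer of them for a given network width, which is the width dependence the contention analysis above quantifies.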