Software Combining Algorithms for Distributing Hot-Spot Addressing

Abstract In a large shared-memory multiprocessor system, a large number of simultaneous accesses to a single shared variable (called a hot spot in [10]) can degrade the performance of its shared memory system. Software combining [14] is an inexpensive alternative to the hardware combining networks [3, 9] for tackling this problem. This paper gives software combining algorithms for three different types of hot-spot accesses: (1) barrier synchronizations in parallel loops, (2) fetch-and-add type of operations, and (3) P and V operations on semaphores. They include most of the general hot-spot access patterns. By using software combining trees to distribute hot-spot accessings, the number of processors that can access the same location is greatly reduced. In these algorithms, the completion time of a hot-spot access is in the order O(log2N) in a multiprocessor system with N processors, assuming that the delay of a switch element in an interconnection network is a constant time, O(1).