Hot spot analysis in large scale shared memory multiprocessors

Scalable multiprocessors that present a shared-memory image to application programmers are typically built from physically distributed memory modules. Consequently, the time a given processor needs to access memory differs from one part of physical memory to another. The authors explore the implications of this nonuniformity in memory access times. In particular, they study the effect of hot spots in hierarchical large-scale NUMA multiprocessors. They have developed an analytical model of access latencies and of contention for shared resources in the interconnection network that links the processors and memory modules. The objective is to provide a better understanding of nonuniform memory access times in scalable architectures. They show the extent to which a variable can be shared before it becomes a performance bottleneck, and assess the potential gain from replicating shared data items. They also demonstrate that the backoff value (the delay applied after a memory request is rejected) must be chosen carefully to balance memory access time against network utilization, and that allowing memory requests to be buffered improves memory utilization.
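The claim that a shared variable eventually becomes a bottleneck, and that replication can help, can be illustrated with a standard back-of-envelope hot-spot calculation. The sketch below is not the authors' analytical model: it ignores the hierarchy, the interconnection network, and backoff entirely. It assumes as many memory modules as processors, a uniform background access pattern, read-only replicas with hot requests spread evenly among them, and a module that serves one request per cycle. The function names and parameters are hypothetical.

```python
def hot_module_load(n_procs: int, rate: float, hot_fraction: float,
                    replicas: int = 1) -> float:
    """Requests per cycle arriving at a module that holds the hot item.

    Assumptions (illustrative only, not taken from the paper):
    - n_procs processors and n_procs memory modules;
    - each processor issues `rate` requests per cycle;
    - a fraction `hot_fraction` of requests targets the hot item, and the
      remainder is spread uniformly over all modules;
    - the hot item is replicated on `replicas` modules and hot requests
      are split evenly among them.
    """
    uniform = rate * (1.0 - hot_fraction)            # background share per module
    hot = n_procs * rate * hot_fraction / replicas   # hot traffic per replica
    return uniform + hot


def max_sustainable_rate(n_procs: int, hot_fraction: float,
                         replicas: int = 1, service_rate: float = 1.0) -> float:
    """Largest per-processor request rate before the hot module saturates."""
    load_per_unit_rate = (1.0 - hot_fraction) + n_procs * hot_fraction / replicas
    return service_rate / load_per_unit_rate


if __name__ == "__main__":
    # Compare the saturation point with and without replication.
    for k in (1, 2, 10):
        r = max_sustainable_rate(n_procs=1000, hot_fraction=0.01, replicas=k)
        print(f"replicas={k}: max rate per processor = {r:.4f}")
```

Under these assumptions the sustainable request rate falls roughly in proportion to the number of processors sharing the hot item, while each additional replica divides the hot traffic per module, which is the qualitative trade-off the abstract describes.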

K. Harzallah et al. Hot spot analysis in large scale shared memory multiprocessors, 1993, Supercomputing '93.