Evaluation of NUMA Memory Management Through Modeling and Measurements

Dynamic page placement policies for NUMA (nonuniform memory access time) shared-memory architectures are explored using two complementary approaches. The authors measure the performance of parallel programs running on the experimental DUnX operating system kernel for the BBN GP1000, which supports a highly parameterized dynamic page placement policy. They also develop and apply an analytic model of the memory system performance of a local/remote NUMA architecture, based on approximate mean-value analysis techniques. The model is validated against experimental data obtained with DUnX while running a synthetic workload. The results of this validation show that, in general, model predictions are quite good. Experiments are described that investigate the effectiveness of dynamic page placement and, in particular, dynamic multiple-copy page placement; the cost of replication/coherency fault errors; and the cost of errors in deciding whether a page should move or be remotely referenced.
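To make "approximate mean-value analysis" concrete, the following is a minimal sketch of the textbook Schweitzer-style fixed-point iteration for a single-class closed queueing network. It is not the authors' model: the abstract does not give their equations, so the function name `approximate_mva`, its parameters, and the mapping of stations to memory resources are all assumptions made for illustration. Under that assumed mapping, customers stand for processors, the think time for computation between memory references, and each station for a contended resource such as a memory module or switch.

```python
def approximate_mva(demands, n_customers, think_time, tol=1e-9, max_iter=10000):
    """Schweitzer approximate MVA for a single-class closed network.

    demands      -- service demand at each queueing station (hypothetical
                    stand-ins for memory modules or switch stages)
    n_customers  -- number of circulating customers (e.g. processors), >= 1
    think_time   -- delay between requests (e.g. compute time per reference)

    Returns (throughput, residence_times, queue_lengths).
    """
    k = len(demands)
    # Initial guess: customers spread evenly over the stations.
    queue = [n_customers / k] * k
    scale = (n_customers - 1) / n_customers
    for _ in range(max_iter):
        # An arriving customer is assumed to see (N-1)/N of the mean queue.
        resid = [d * (1.0 + scale * q) for d, q in zip(demands, queue)]
        # Little's law applied to the whole network.
        throughput = n_customers / (think_time + sum(resid))
        # Little's law applied to each station.
        new_queue = [throughput * r for r in resid]
        converged = max(abs(a - b) for a, b in zip(new_queue, queue)) < tol
        queue = new_queue
        if converged:
            break
    return throughput, resid, queue
```

A call such as `approximate_mva([1.0, 2.0], 8, 3.0)` would estimate the throughput of eight processors sharing two resources with the given (made-up) demands. The approximation avoids the recursion over population sizes that exact MVA requires, which is why it is often used for quick parameter sweeps.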
