Interprocessor Invocation on a NUMA Multiprocessor

On a distributed shared memory machine, minimizing accesses to remote memory modules is crucial for obtaining high performance. We describe OSMIUM, an object-based parallel programming system built to support experiments with mechanisms for performing invocations on remote objects. The mechanisms we have studied include non-cached access to remote memory, data migration, and function shipping using an interprocessor invocation protocol (IIP). Our analyses and experiments indicate that IIP competes well with the alternatives, especially when the structure of user programs requires synchronized access to data structures. Although these results were obtained on a NUMA multiprocessor, they also apply to systems that use hardware cache coherency techniques.
