Contention resolution on a broadcast-based distributed shared memory multiprocessor

This paper addresses the problem of resolving remote memory access contention on hardware distributed shared memory multiprocessors and the performance impact of implementing a contention resolution algorithm. After summarising a multiprocessor architecture called the simultaneous optical multiprocessor exchange bus (SOME-Bus), it presents a simple but effective contention resolution algorithm that relies on the number of messages in the channel queue, as reported by each node. The algorithm detects potential hot spots, resolves contention using a dynamic page migration protocol, and balances remote memory accesses across the nodes of the system. Simulations with eight parallel codes on a 64-processor SOME-Bus show that the algorithm yields significant performance improvements: a balanced memory load and reductions in execution time, the number of remote memory accesses, average channel waiting time and average network latency.
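The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: each node reports its channel queue length, a node counts as a potential hot spot when its queue exceeds a multiple of the system average, and its most remotely accessed page is migrated to the node with the shortest queue. The names `Node`, `find_hot_spots` and `resolve_contention`, the threshold `factor`, and the load-transfer estimate are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Node:
    node_id: int
    queue_length: int  # messages waiting in this node's channel queue
    # page id -> remote access count (hypothetical per-page counter)
    remote_accesses: dict = field(default_factory=dict)


def find_hot_spots(nodes, factor=2.0):
    """Flag nodes whose reported queue length exceeds factor * system average.

    The threshold rule is an assumption made for illustration.
    """
    avg = mean(n.queue_length for n in nodes)
    return [n for n in nodes if n.queue_length > factor * avg]


def resolve_contention(nodes, factor=2.0):
    """Migrate the most remotely accessed page of each hot node.

    The destination is the node with the shortest queue (an assumed policy).
    Returns a list of (page, source_id, destination_id) tuples.
    """
    migrations = []
    for hot in find_hot_spots(nodes, factor):
        if not hot.remote_accesses:
            continue  # nothing to migrate from this node
        page = max(hot.remote_accesses, key=hot.remote_accesses.get)
        target = min((n for n in nodes if n is not hot),
                     key=lambda n: n.queue_length)
        count = hot.remote_accesses.pop(page)
        target.remote_accesses[page] = count
        # Crude estimate: traffic for the page follows it to the new home.
        moved = min(count, hot.queue_length)
        hot.queue_length -= moved
        target.queue_length += moved
        migrations.append((page, hot.node_id, target.node_id))
    return migrations


if __name__ == "__main__":
    nodes = [
        Node(0, 40, {"p1": 10, "p2": 25}),
        Node(1, 2),
        Node(2, 4),
        Node(3, 2),
    ]
    print(resolve_contention(nodes))
```

In this toy run, node 0 is the only node above the assumed threshold, so its busiest page is moved to the first lightly loaded node. A real implementation would also need to limit how often a page can migrate, to avoid pages bouncing between nodes.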
