An effective selection policy for load balancing in software DSM

Load balancing is an active research area in software distributed shared memory (DSM) systems. When threads are dynamically migrated from heavily loaded nodes to lightly loaded nodes to balance load, the communication cost of maintaining data consistency increases if the threads to migrate are selected carelessly. Program performance degrades when the loss from this additional communication exceeds the benefit of load balancing, so the threads to migrate must be chosen carefully. This study addresses the problem with a novel selection policy called Reduce Internode Sharing Cost (RISC). The main characteristic of this thread selection policy is that it considers both the memory access types of threads and global sharing simultaneously. Experiments applying the policy to a DSM system called Cohesion show that considering memory access types and global sharing together is necessary for thread selection. RISC can reduce the data-consistency communication of the benchmark applications by 50% while the load balancing mechanism is executing.
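The abstract does not give RISC's cost model, so the following is only a hypothetical sketch of what selecting a migration thread by access type and global sharing could look like. The `Thread` class, the `READ_COST` and `WRITE_COST` weights, and the functions `pair_cost`, `internode_cost`, and `select_thread` are all assumptions made for illustration and are not taken from the paper or from Cohesion. The sketch assumes that write-shared pages cost more to keep consistent than read-shared pages, and it evaluates each candidate against the sharing across all nodes rather than only the source and destination.

```python
from dataclasses import dataclass, field

# Assumed weights: keeping a write-shared page consistent across nodes
# is taken to cost more than a read-shared page. Values are illustrative.
READ_COST = 1
WRITE_COST = 3


@dataclass
class Thread:
    tid: int
    node: int
    reads: set = field(default_factory=set)   # pages the thread reads
    writes: set = field(default_factory=set)  # pages the thread writes


def pair_cost(a, b):
    """Assumed consistency cost if threads a and b run on different nodes."""
    # A page is write-shared if at least one of the two threads writes it
    # and the other thread reads or writes it.
    write_shared = (a.writes & (b.reads | b.writes)) | (b.writes & a.reads)
    # Remaining pages that both threads only read.
    read_shared = (a.reads & b.reads) - write_shared
    return WRITE_COST * len(write_shared) + READ_COST * len(read_shared)


def internode_cost(threads, placement):
    """Sum pair_cost over every pair of threads placed on different nodes."""
    total = 0
    for i, a in enumerate(threads):
        for b in threads[i + 1:]:
            if placement[a.tid] != placement[b.tid]:
                total += pair_cost(a, b)
    return total


def select_thread(threads, src, dst):
    """Pick the thread on src whose move to dst gives the lowest global cost."""
    placement = {t.tid: t.node for t in threads}
    best, best_cost = None, None
    for t in threads:
        if t.node != src:
            continue
        trial = dict(placement)
        trial[t.tid] = dst  # tentatively migrate this thread
        cost = internode_cost(threads, trial)
        if best_cost is None or cost < best_cost:
            best, best_cost = t, cost
    return best, best_cost
```

In this sketch, a thread that only read-shares pages with threads already on the destination node would be preferred over one that write-shares pages with threads staying behind, because moving the latter introduces new internode write sharing.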
