Thread migration and load balancing in non-dedicated environments

Networks of workstations are fast becoming the standard environment for parallel applications. However, the use of "found" resources as a platform for tightly-coupled runtime environments has at least three obstacles: contention for resources, differing processor speeds, and processor heterogeneity. All three obstacles result in load imbalance, leading to poor performance for scientific applications. This paper describes the use of thread migration in transparently addressing this load imbalance in the context of the CVM software distributed shared memory system. We describe the implementation and performance of mechanisms and policies that accommodate both resource contention, and heterogeneity in clock speed and processor type. Our results show that these cycles can indeed be effectively exploited, and that the runtime cost of processor heterogeneity can be quite manageable. Along the way, however, we identify a number of problems that need to be addressed before such systems can enjoy widespread use.

[1]  A. Gupta,et al.  The Stanford FLASH multiprocessor , 1994, Proceedings of 21 International Symposium on Computer Architecture.

[2]  J.P. Singh,et al.  Scaling application performance on a cache-coherent multiprocessors , 1999, Proceedings of the 26th International Symposium on Computer Architecture (Cat. No.99CB36367).

[3]  Miron Livny,et al.  Condor-a hunter of idle workstations , 1988, [1988] Proceedings. The 8th International Conference on Distributed.

[4]  Charles L. Seitz,et al.  Myrinet: A Gigabit-per-Second Local Area Network , 1995, IEEE Micro.

[5]  Todd C. Mowry,et al.  Comparative evaluation of latency tolerance techniques for software distributed shared memory , 1998, Proceedings 1998 Fourth International Symposium on High-Performance Computer Architecture.

[6]  Steven W. K. Tjiang,et al.  SUIF: an infrastructure for research on parallelizing and optimizing compilers , 1994, SIGP.

[7]  Anoop Gupta,et al.  The SPLASH-2 programs: characterization and methodological considerations , 1995, ISCA.

[8]  Peter J. Keleher,et al.  Update protocols and iterative scientific applications , 1998, Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing.

[9]  Peter J. Keleher,et al.  Active correlation tracking , 1999, Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003).

[10]  Peter J. Keleher,et al.  The relative importance of concurrent writers and weak consistency models , 1996, Proceedings of 16th International Conference on Distributed Computing Systems.

[11]  Peter J. Keleher,et al.  Per-Node Multithreading and Remote Latency , 1998, IEEE Trans. Computers.