Dynamite - Blasting Obstacles to Parallel Cluster Computing

Workstations make up a very large fraction of the total available computing capacity in many organisations. To use this capacity optimally, computing resources must be allocated dynamically. The Esprit project Dynamite addresses this load-balancing problem by migrating tasks in a dynamically linked parallel program. An important goal of the project is to do so in a manner that is transparent to both the application programmer and the user. The Pam-Crash software from ESI serves as a test bed.
