Dome: parallel programming in a distributed computing environment

The Distributed Object Migration Environment (Dome) addresses three major issues of distributed parallel programming: ease of use, load balancing, and fault tolerance. Dome provides process control, data distribution, communication, and synchronization for Dome programs running in a heterogeneous distributed computing environment. The parallel programmer writes a C++ program using Dome objects, which are automatically partitioned and distributed over a network of computers. Dome incorporates a load-balancing facility that automatically adjusts the mapping of objects to machines at runtime, and it exhibits significant performance gains over standard message-passing programs executing in an imbalanced system. Dome also provides checkpointing of program state in an architecture-independent manner, allowing Dome programs to be checkpointed on one architecture and restarted on another.
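The idea of adjusting the mapping of data to machines at runtime can be illustrated with a small C++ sketch. This is not Dome's API or its actual balancing algorithm, which the abstract does not describe. The function name `rebalance` and the use of measured per-machine throughput as the load metric are assumptions made only for illustration. The sketch divides the elements of a distributed object among machines in proportion to how fast each machine has recently been processing them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of Dome: decide how many of `n` elements
// each machine should hold, in proportion to its measured throughput
// (for example, elements processed per second in the last phase).
std::vector<std::size_t> rebalance(std::size_t n,
                                   const std::vector<double>& throughput) {
    std::vector<std::size_t> counts(throughput.size(), 0);
    const double total =
        std::accumulate(throughput.begin(), throughput.end(), 0.0);
    if (throughput.empty() || total <= 0.0) {
        return counts;  // nothing sensible to assign
    }

    std::size_t assigned = 0;
    for (std::size_t i = 0; i < throughput.size(); ++i) {
        // Proportional share, rounded down and clamped so the running
        // total can never exceed n.
        const auto share =
            static_cast<std::size_t>(n * (throughput[i] / total));
        counts[i] = std::min(share, n - assigned);
        assigned += counts[i];
    }

    // Hand out the elements lost to rounding, one per machine in index order.
    for (std::size_t i = 0; assigned < n; i = (i + 1) % counts.size()) {
        ++counts[i];
        ++assigned;
    }

    assert(assigned == n);  // every element has exactly one owner
    return counts;
}
```

Under these assumptions, a machine that is three times faster than another receives roughly three times as many elements, for example 75 and 25 out of 100. A real system would also have to move the affected elements between machines and weigh the cost of that migration against the expected gain.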
