Latency hiding in dynamic partitioning and load balancing of grid computing applications

The Information Power Grid (IPG) concept developed by NASA aims to provide a metacomputing platform for large-scale distributed computations by hiding the intricacies of a highly heterogeneous environment while maintaining adequate security. We propose a latency-tolerant partitioning scheme that dynamically balances processor workloads on the IPG and minimizes data movement and runtime communication. By simulating an unsteady adaptive mesh application on a wide area network, we study the performance of our load balancer under the Globus environment. The number of IPG nodes, the number of processors per node, and the interconnect speeds are parameterized to derive conditions under which the IPG would be suitable for parallel distributed processing of such applications. Experimental results demonstrate that effective solutions are achieved when the IPG nodes are connected by a high-speed asynchronous interconnection network.
