A Scalable Diffusion Algorithm for Dynamic Mapping and Load Balancing on Networks of Arbitrary Topology

This paper considers the problems of mapping and load balancing applications on arbitrary networks. A novel diffusion algorithm is presented to solve the mapping problem. It complements the well-known diffusion algorithms for load balancing, which have enjoyed success on massively parallel computers (MPPs). Mapping is more difficult on interconnection networks than on MPPs because network topology varies. Popular mapping algorithms for MPPs that depend on recursive topologies are not applicable to irregular networks. The most celebrated of these MPP algorithms use information from the Laplacian matrix of a graph of communicating processes. The diffusion algorithm presented in this paper is also derived from this Laplacian matrix, but it works on arbitrary network topologies and is dramatically faster than those MPP algorithms. It tolerates both delays and faults. Time to convergence depends on initial conditions and is insensitive to problem scale. This scalability, among other features, makes the diffusion algorithm a viable candidate for dynamic mapping and load balancing not only on existing MPP systems but also on large distributed systems like the Internet, small cluster computers, and networks of workstations.
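
The abstract does not spell out the mapping algorithm, so the following is only a minimal sketch of the classical first-order diffusion scheme for load balancing that it refers to, not the paper's own method. Each node repeatedly exchanges a fraction `alpha` of its load difference with every neighbor, which is equivalent to the update x ← x − alpha·L·x, where L is the Laplacian matrix of the network graph. The function names, the example graph, and the choice alpha = 1/(max degree + 1) are assumptions made for illustration.

```python
def max_degree(n, edges):
    """Largest number of neighbors of any node in an undirected graph."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree)


def diffusion_step(load, edges, alpha):
    """One synchronous diffusion step: x <- x - alpha * L * x.

    Every edge (u, v) moves alpha * (load[u] - load[v]) units from u to v,
    using only the loads from the previous step, so total load is conserved.
    """
    new_load = list(load)
    for u, v in edges:
        flow = alpha * (load[u] - load[v])
        new_load[u] -= flow
        new_load[v] += flow
    return new_load


def balance(load, edges, tol=1e-6, max_steps=10000):
    """Iterate until the spread between max and min load is at most tol.

    Assumption: alpha = 1 / (max degree + 1), a conservative step size
    that keeps the iteration stable on a connected graph.
    """
    alpha = 1.0 / (max_degree(len(load), edges) + 1)
    for step in range(max_steps):
        if max(load) - min(load) <= tol:
            return load, step
        load = diffusion_step(load, edges, alpha)
    return load, max_steps


if __name__ == "__main__":
    # Hypothetical irregular network: a 4-cycle with one extra leaf node.
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 4)]
    final, steps = balance([50.0, 0.0, 0.0, 0.0, 0.0], edges)
    print(steps, [round(x, 3) for x in final])
```

Because each step uses only information from immediate neighbors, the scheme needs no global knowledge of the topology, which is why it applies to irregular networks as well as regular ones.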
