Distributed scheduling of unstructured collective communication on the CM-5

Parallelization of irregular applications often results in unstructured collective communication. We present a distributed algorithm for scheduling such communication on parallel machines. We describe the performance of this algorithm on the CM-5 and show that the scheduling algorithm gives a significant improvement over naive methods.
