Near-Optimal All-to-All Broadcast in Multidimensional All-Port Meshes and Tori

All-to-all communication is one of the densest collective communication patterns and occurs in many important applications in parallel and distributed computing. In this paper, we present a new all-to-all broadcast algorithm for multidimensional all-port mesh and torus networks. We propose a broadcast pattern that balances the traffic load across all dimensions of the network, so that the algorithm achieves a near-optimal transmission time. The algorithm also overlaps message switching time with transmission time, and its total communication delay asymptotically matches the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetric across all messages and nodes, so it can be readily implemented in hardware and can achieve near-optimal performance in practice.
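As a rough illustration of the kind of lower bound referred to above, the sketch below computes a simple receive-bandwidth bound on the number of transmission steps. It is not taken from the paper: it assumes unit-size messages, at most one message per link per step, and no message combining, and the function name `transmission_lower_bound` is hypothetical. Under those assumptions, the node with the fewest links must still receive one message from each of the other N - 1 nodes, so at least ceil((N - 1) / min_degree) steps are needed.

```python
from math import ceil, prod


def transmission_lower_bound(dims, torus=True):
    """Receive-bandwidth lower bound on all-to-all broadcast steps.

    Assumptions (not from the paper): unit-size messages, one message
    per link per step, all ports usable at once, no message combining.
    dims is a sequence of dimension sizes, e.g. (4, 4) for a 4 x 4 network.
    """
    n_nodes = prod(dims)
    if n_nodes <= 1:
        return 0  # a single node has nothing to receive

    if torus:
        # With wraparound, every node has 2 neighbors in a dimension of
        # size >= 3, 1 neighbor if the size is 2, and none if it is 1.
        min_degree = sum(2 if k >= 3 else (1 if k == 2 else 0) for k in dims)
    else:
        # In a mesh, a corner node has only 1 neighbor per dimension
        # of size >= 2, so it is the bottleneck receiver.
        min_degree = sum(1 for k in dims if k >= 2)

    # The least-connected node must receive n_nodes - 1 messages,
    # at most min_degree of them per step.
    return ceil((n_nodes - 1) / min_degree)


if __name__ == "__main__":
    print(transmission_lower_bound((4, 4), torus=True))
    print(transmission_lower_bound((4, 4), torus=False))
```

The bound shows why balancing traffic across dimensions matters: it can only be approached if every incoming link of a node carries a useful message in nearly every step.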
