Multidestination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths

This paper proposes multidestination message passing on wormhole k-ary n-cube networks using a new base-routing-conformed-path (BRCP) model. This model allows both unicast (single-destination) and multidestination messages to coexist in a given network without leading to deadlock. The model is illustrated with several common routing schemes (deterministic as well as adaptive), and the associated deadlock-freedom properties are analyzed. Using this model, a set of new algorithms for two popular collective communication operations, broadcast and multicast, is proposed and evaluated. It is shown that the proposed algorithms can considerably reduce the latency of these operations compared to the U-mesh (unicast-based multicast) and the Hamiltonian path-based schemes. Notably, the results show that a multicast can be implemented with reduced or near-constant latency as the number of processors participating in the multicast increases beyond a certain number. It is also shown that the BRCP model can take advantage of adaptivity in routing schemes to further reduce the latency of these operations. The multidestination mechanism and the BRCP model establish a new foundation for fast and scalable collective communication support on wormhole-routed systems.
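To make the idea of a base-routing-conformed path concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a 2D mesh without wraparound links and deterministic dimension-order (X then Y) routing as the base scheme, and it treats an ordered destination list as conforming when every destination lies, in order, on the path that a unicast from the source to the last destination would take. The function names `xy_path` and `conforms_to_xy` are hypothetical and introduced only for illustration.

```python
def xy_path(src, dst):
    """Nodes visited by X-then-Y routing on a 2D mesh, source included.

    Assumption: mesh without wraparound links; nodes are (x, y) tuples.
    """
    x, y = src
    path = [(x, y)]
    step = 1 if dst[0] > x else -1
    while x != dst[0]:          # travel along the X dimension first
        x += step
        path.append((x, y))
    step = 1 if dst[1] > y else -1
    while y != dst[1]:          # then travel along the Y dimension
        y += step
        path.append((x, y))
    return path


def conforms_to_xy(src, dests):
    """True if dests, in the given order, all lie on the X-then-Y path
    from src to the last destination (illustrative conformance check)."""
    if not dests:
        return True
    path = xy_path(src, dests[-1])
    pos = 0
    for d in dests:
        try:
            # Each destination must appear strictly after the previous one.
            pos = path.index(d, pos) + 1
        except ValueError:
            return False
    return True


if __name__ == "__main__":
    print(conforms_to_xy((0, 0), [(2, 0), (3, 0), (3, 2)]))
    print(conforms_to_xy((0, 0), [(0, 2), (3, 2)]))
```

Under these assumptions, the first call describes a destination list that a single multidestination message could cover, since all three nodes sit on one X-then-Y path. The second does not, because reaching (0, 2) before (3, 2) would require moving in Y before X, which the base routing never does for a unicast.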
