Collective Communication in Wormhole-Routed Massively Parallel Computers

Most massively parallel computer (MPC) networks use wormhole routing to reduce the effect of path length on communication time. Many research projects have built on this property to design efficient collective communication algorithms for wormhole-routed systems. Because they exploit the relative distance-insensitivity of wormhole routing, these algorithms often differ fundamentally from their store-and-forward counterparts. We examine both software and hardware approaches to implementing collective communication operations. We emphasize methods for direct networks, such as hypercubes and meshes, rather than indirect switch-based networks, although several approaches apply to systems of either type. We illustrate several issues that arise in this research area and describe the major classes of algorithms proposed to address them.
