Supporting MPI collective communication on network processors

We present work that extends MPI-NP, our previous Myrinet port of LAM/MPI, with collective communication primitives implemented on the NIC. This work is another step in our effort to make the NIC MPI-aware. We believe that an MPI-aware control program on the NIC can deliver a richer set of performance enhancements to MPI applications, not restricted to better bandwidth and latency. MPI collective communication involves considerable interaction between the communication subsystems of the nodes, and this interaction is of no direct interest to the application. By migrating these talkative components to the Myrinet network interface card, we allow this dialog between the nodes to happen with minimum latency. We explore the advantages of supporting several MPI collective communication routines on the NIC, including MPI_Bcast(), MPI_Barrier(), and MPI_Comm_create().
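To give a sense of how much inter-node dialog a single collective call can generate, the following is a minimal sketch that lists the point-to-point messages behind one broadcast. It assumes a binomial-tree broadcast, a common textbook scheme; the abstract does not state which algorithm MPI-NP uses, and the function name `binomial_broadcast_schedule` is hypothetical.

```python
def binomial_broadcast_schedule(num_ranks, root=0):
    """Return the message schedule of a binomial-tree broadcast.

    Illustrative assumption only: the abstract does not say which
    broadcast algorithm MPI-NP uses. The result is a list of rounds;
    each round is a list of (sender, receiver) rank pairs.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    if not 0 <= root < num_ranks:
        raise ValueError("root must be a valid rank")

    rounds = []
    step = 1  # ranks 0..step-1 (relative to root) already hold the data
    while step < num_ranks:
        sends = []
        for rel in range(step):
            peer = rel + step
            if peer < num_ranks:
                # Map ranks relative to the root back to absolute ranks.
                sender = (rel + root) % num_ranks
                receiver = (peer + root) % num_ranks
                sends.append((sender, receiver))
        rounds.append(sends)
        step *= 2  # the set of data holders doubles every round
    return rounds


if __name__ == "__main__":
    for number, sends in enumerate(binomial_broadcast_schedule(8), start=1):
        print(f"round {number}: {sends}")
```

For eight ranks this schedule has three rounds and seven messages in total. In a host-based implementation each forwarding step would typically involve the host of an intermediate node, which is the kind of traffic the abstract proposes to handle on the NIC instead.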
