NIC-Based Reduction in Myrinet Clusters: Is It Beneficial?

Reduction-to-one and reduction-to-all are common collective operations in parallel and distributed systems. Because they can involve many processes, it is important to make them fast and efficient. Some modern network interface controllers (NICs) for system area networks (SANs) have programmable processors that can be used to offload protocol processing from the host processor. In this paper we investigate using the NIC processor to improve the performance of reduction operations. We implemented a NIC-based reduction-to-one operation that supports integer and floating-point operations, and evaluated it. Our evaluation shows that the NIC-based operation outperforms the traditional host-based approach by a factor of up to 1.19. NIC-based reduction also reduces host CPU utilization by a factor of 2.7 and reduces the effects of process skew by a factor of up to 4.5.
