10-Gigabit iWARP Ethernet: Comparative Performance Analysis with InfiniBand and Myrinet-10G

iWARP is a set of standards that enables remote direct memory access (RDMA) over Ethernet. By providing RDMA and OS bypass, and coupled with TCP/IP offload engines, iWARP can fully eliminate host CPU involvement in an Ethernet environment. With the iWARP standard and the introduction of 10-Gigabit Ethernet, there is now an alternative to proprietary interconnects for high-performance computing that remains compatible with existing Ethernet infrastructure and protocols. Recently, NetEffect Inc. introduced an iWARP-enabled 10-Gigabit Ethernet channel adapter. In this paper, we assess the potential of such an interconnect for high-performance computing by comparing its performance with that of two leading cluster interconnects, InfiniBand and Myrinet-10G. The results show that the NetEffect iWARP implementation achieves an unprecedentedly low latency for Ethernet and saturates 87% of the available bandwidth. It also scales better as the number of connections grows. At the MPI level, iWARP outperforms InfiniBand in queue usage and buffer re-use.
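All of the compared interconnects are driven through an RDMA-style, queue-pair interface. As a purely illustrative sketch (not code from the paper), the fragment below posts a one-sided RDMA Write through the OpenFabrics verbs API that iWARP and InfiniBand adapters commonly expose; the queue pair qp, the registered region local_mr, and the peer's remote_addr and remote_rkey are assumed to have been established during connection setup.

    /* Illustrative sketch only: post a one-sided RDMA Write through the
     * OpenFabrics verbs API.  Connection setup, memory registration, and
     * the exchange of the peer's buffer address and rkey are assumed to
     * have been done elsewhere. */
    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    static int post_rdma_write(struct ibv_qp *qp, struct ibv_mr *local_mr,
                               void *local_buf, uint32_t len,
                               uint64_t remote_addr, uint32_t remote_rkey)
    {
        struct ibv_sge sge;
        struct ibv_send_wr wr, *bad_wr = NULL;

        memset(&sge, 0, sizeof(sge));
        sge.addr   = (uintptr_t)local_buf;          /* registered local buffer */
        sge.length = len;
        sge.lkey   = local_mr->lkey;

        memset(&wr, 0, sizeof(wr));
        wr.opcode              = IBV_WR_RDMA_WRITE; /* one-sided write         */
        wr.send_flags          = IBV_SEND_SIGNALED; /* ask for a completion    */
        wr.sg_list             = &sge;
        wr.num_sge             = 1;
        wr.wr.rdma.remote_addr = remote_addr;       /* peer buffer address     */
        wr.wr.rdma.rkey        = remote_rkey;       /* peer's remote key       */

        /* The adapter moves the data directly between registered buffers;
         * the host CPU only posts this descriptor and later reaps the
         * completion, which is the OS-bypass behaviour being evaluated. */
        return ibv_post_send(qp, &wr, &bad_wr);
    }

On an iWARP adapter, the same work request is carried over a TCP stream handled by the offload engine, which is what keeps the traffic compatible with ordinary Ethernet switching.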
