Zero Copy Sockets Direct Protocol over InfiniBand: Preliminary Implementation and Performance Analysis

Sockets Direct Protocol (SDP) is a byte-stream transport protocol that implements TCP SOCK_STREAM semantics while using the transport offloading capabilities of the InfiniBand fabric. Under the hood, SDP supports a zero-copy (ZCopy) operation mode, which uses the InfiniBand RDMA capability to transfer data directly between application buffers. Alternatively, in buffer copy (BCopy) mode, data is copied to and from transport buffers. In the initial open-source SDP implementation, ZCopy mode was restricted to asynchronous I/O operations. We added prototype ZCopy support for the synchronous send()/recv() socket calls. This paper presents the major architectural aspects of the SDP protocol, the ZCopy implementation, and a preliminary performance evaluation. We show substantial benefits of ZCopy when multiple connections run in parallel on the same host. For example, with 8 connections simultaneously active, enabling ZCopy increases bandwidth from 500 MB/s to 700 MB/s, while CPU utilization decreases by a factor of 8.
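As a rough illustration of the synchronous send()/recv() byte-stream interface that the abstract refers to, the sketch below exchanges a buffer over a connected SOCK_STREAM socket pair. It is not SDP and does not touch InfiniBand: a local socket pair is assumed as a stand-in so the example runs on an ordinary host, and the function name `echo_once` is hypothetical. Whether the transport underneath copies through intermediate buffers (BCopy) or moves data directly between application buffers (ZCopy) is invisible at this level, which is the point of preserving the socket semantics.

```python
import socket


def echo_once(payload: bytes) -> bytes:
    """Send payload over a local stream socket pair and read it back.

    Assumption: socket.socketpair() stands in for an SDP connection; it
    provides the same blocking byte-stream calls, not the same transport.
    Keep payload small so the blocking send fits in the socket buffer.
    """
    sender, receiver = socket.socketpair()  # connected SOCK_STREAM sockets
    try:
        # Synchronous send: returns once the data has been handed to the transport.
        sender.sendall(payload)

        # A byte stream has no message boundaries, so loop until all bytes arrive.
        received = bytearray()
        while len(received) < len(payload):
            chunk = receiver.recv(len(payload) - len(received))
            if not chunk:  # peer closed the connection early
                break
            received.extend(chunk)
        return bytes(received)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    data = b"x" * 4096
    print(echo_once(data) == data)
```

The receive loop reflects the stream semantics: a single recv() may return fewer bytes than requested, regardless of how the underlying transport moves the data.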
