Implementation of an RDMA verbs driver for GridFTP

GridFTP is used by researchers to move large data sets across grid networks. Its benefits include data striping, multiple parallel data channels, and checkpointing. Like traditional FTP, GridFTP separates the control channel from the data channel, but it also includes extensions that allow third-party control of a transfer. GridFTP is one component of the Globus Toolkit, a software system designed to allow researchers to easily adapt, extend, and modify GridFTP and other tools to meet their specific needs. The eXtensible Input/Output (XIO) component, which provides common and logging libraries as well as built-in transport drivers such as TCP and UDP, defines the architecture and API used for implementing new drivers. Remote Direct Memory Access (RDMA) is a network communication architecture that defines a number of features not available with traditional sockets-based communication. With RDMA, all I/O operations are offloaded to hardware. Combined with the use of registered memory, this eliminates the intermediate copying of data to kernel buffers that TCP performs. Because the kernel TCP/IP stack is not used, latency and CPU overhead are significantly reduced. The RDMA architecture is implemented with InfiniBand cables and interface cards, or with Ethernet cables and RoCE or iWARP interface cards. This thesis, funded in part by National Science Foundation Grant OCI-1127228, describes the implementation of a new Globus Toolkit XIO driver that uses the RDMA Verbs API to drive network communication. Unlike other XIO drivers, the RDMA driver does not produce an intermediate copy of data prior to transmission over the network, thereby taking advantage of the full capabilities of RDMA. The work presented here describes the motivation, implementation, and various challenges and solutions associated with implementing a new XIO driver for RDMA, and gives an overview of its effectiveness compared with TCP.
