Protocols for wide-area data-intensive applications: Design and performance issues

Providing high-speed data transfer is vital to many data-intensive applications. While remarkable technology advances now deliver ultra-high-speed network bandwidth, existing protocols and applications may not be able to fully utilize the bare-metal bandwidth because of their inefficient design. We identify that the same problem persists in Remote Direct Memory Access (RDMA) networks. RDMA offloads TCP/IP protocol processing to hardware devices; however, its benefits have not been fully exploited, owing to the lack of efficient software and application protocols, particularly in wide-area networks. In this paper, we address the design choices involved in developing such protocols. We describe a protocol implemented as part of a communication middleware. The protocol provides its own flow control, connection management, and task synchronization, and it maximizes the parallelism of RDMA operations. We demonstrate its performance benefits on various local-area and wide-area testbeds, including the DOE ANI testbed with RoCE and InfiniBand links.
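To illustrate what "maximizing the parallelism of RDMA operations" can look like in practice, the sketch below pipelines RDMA WRITE work requests over the ibverbs API under a simple credit (pipeline-depth) limit. This is a minimal, hypothetical example, not the paper's middleware: the PIPELINE_DEPTH and CHUNK_SIZE constants, the helper names, and the assumption that the queue pair, completion queue, memory registration, and the peer's remote address and rkey have already been set up and exchanged are all assumptions made for illustration.

/*
 * Hypothetical sketch (not the paper's actual protocol): keep several RDMA
 * WRITE operations in flight at once, reclaiming a "credit" for each
 * completion, so the wide-area pipe stays full instead of waiting on each
 * operation in turn. Assumes qp, cq, mr, buf, remote_addr, and rkey are
 * already established, and that total_size is a multiple of CHUNK_SIZE.
 */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <stdio.h>

#define PIPELINE_DEPTH 16           /* max RDMA WRITEs in flight (assumed) */
#define CHUNK_SIZE     (1 << 20)    /* 1 MiB per RDMA WRITE (assumed) */

/* Post one RDMA WRITE for the chunk at the given offset. */
static int post_write(struct ibv_qp *qp, struct ibv_mr *mr, char *buf,
                      uint64_t offset, uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)(buf + offset),
        .length = CHUNK_SIZE,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .wr_id      = offset,             /* identify the completion later */
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_RDMA_WRITE,
        .send_flags = IBV_SEND_SIGNALED,  /* request a completion entry */
    };
    wr.wr.rdma.remote_addr = remote_addr + offset;
    wr.wr.rdma.rkey        = rkey;

    struct ibv_send_wr *bad_wr = NULL;
    return ibv_post_send(qp, &wr, &bad_wr);
}

/* Transfer total_size bytes, keeping up to PIPELINE_DEPTH writes in flight. */
int rdma_write_pipelined(struct ibv_qp *qp, struct ibv_cq *cq,
                         struct ibv_mr *mr, char *buf, uint64_t total_size,
                         uint64_t remote_addr, uint32_t rkey)
{
    uint64_t posted = 0, completed = 0;
    int inflight = 0;

    while (completed < total_size) {
        /* Fill the pipeline while credits remain and data is left to send. */
        while (inflight < PIPELINE_DEPTH && posted < total_size) {
            if (post_write(qp, mr, buf, posted, remote_addr, rkey))
                return -1;
            posted += CHUNK_SIZE;
            inflight++;
        }
        /* Reap completions to reclaim credits. */
        struct ibv_wc wc;
        int n = ibv_poll_cq(cq, 1, &wc);
        if (n < 0)
            return -1;
        if (n == 1) {
            if (wc.status != IBV_WC_SUCCESS) {
                fprintf(stderr, "RDMA WRITE failed: %s\n",
                        ibv_wc_status_str(wc.status));
                return -1;
            }
            completed += CHUNK_SIZE;
            inflight--;
        }
    }
    return 0;
}

In a real wide-area setting the pipeline depth would be chosen to cover the bandwidth-delay product, and the receiver would advertise buffer credits through a control channel; both are abstracted away here.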
