Performance Evaluation of Low Latency Communication Alternatives in a Containerized Cloud Environment

Low and stable communication latency opens up new application domains for cloud computing, ranging from industrial control to stateless network functions. In this paper, we evaluate the related performance aspects of two kernel-bypassing technologies, RDMA and DPDK, and of plain kernel sockets in a containerized environment. We show that RDMA and DPDK provide similar latency characteristics for short messages, while RDMA outperforms DPDK as the message size grows. We demonstrate that if CPU usage is a concern, plain UDP sockets are a viable alternative to the bypassing solutions; however, if single-digit microsecond latency is a must, a dedicated CPU is necessary even for RDMA. Finally, we show that software-based message reliability can provide latency characteristics that match, or in some cases even beat, those of RDMA's hardware-based solution, and that noisy neighbors in a cloud environment may significantly impact performance, even with dedicated CPU cores.
