How to Measure the Killer Microsecond

Datacenter-networking research requires tools that both generate traffic and accurately measure latency and throughput. Hardware-based tools have long been available commercially, but they are used primarily to validate ASICs and lack the flexibility needed, for example, to study new protocols. They are also too expensive for academics. The recent development of kernel-bypass networking and of advanced NIC features such as hardware timestamping has created new opportunities for accurate latency measurement. This paper compares the two approaches and asks, in particular, whether commodity servers and NICs, when properly configured, can measure latency distributions as precisely as specialized hardware. Our work shows that well-designed commodity solutions can capture subtle differences in the tail latency of stateless UDP traffic. We use hardware devices as the ground truth, both to measure latency and to forward traffic. We compare the ground truth with observations that combine five latency-measuring clients and five different port-forwarding solutions and configurations. State-of-the-art software such as MoonGen, which uses NIC hardware timestamping, provides sufficient visibility into tail latencies to study the effect of subtle operating system configuration changes. We also observe that the kernel-bypass-based TRex software, which relies only on the CPU to timestamp traffic, can provide solid results when NIC timestamps are not available for a particular protocol or device.
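As a rough illustration of what comparing a client's tail latency against the ground truth can involve, the sketch below computes nearest-rank percentiles from paired transmit and receive timestamps and reports the per-percentile difference between two sample sets. It is not taken from the paper: the function names, the choice of percentiles, and the nearest-rank method are all assumptions made for this example.

```python
import math


def latencies_ns(tx_ns, rx_ns):
    """Per-packet latency from paired transmit/receive timestamps (nanoseconds)."""
    if len(tx_ns) != len(rx_ns):
        raise ValueError("timestamp lists must have the same length")
    return [rx - tx for tx, rx in zip(tx_ns, rx_ns)]


def percentile(samples, p):
    """Nearest-rank percentile of samples, for 0 < p <= 100 (assumed method)."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def tail_summary(samples, points=(50, 99, 99.9)):
    """Map each requested percentile to its latency value."""
    return {p: percentile(samples, p) for p in points}


def tail_error(measured, ground_truth, points=(50, 99, 99.9)):
    """Difference (measured minus ground truth) at each percentile."""
    m = tail_summary(measured, points)
    g = tail_summary(ground_truth, points)
    return {p: m[p] - g[p] for p in points}
```

A client whose `tail_error` stays small at the high percentiles would, under these assumptions, be tracking the hardware measurement closely in the tail and not only at the median.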
