mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems

Scaling the performance of short TCP connections on multicore systems is fundamentally challenging. Although many proposals have attempted to address various shortcomings, inefficiency of the kernel implementation still persists. For example, even state-of-the-art designs spend 70% to 80% of CPU cycles in handling TCP connections in the kernel, leaving only small room for innovation in the user-level program. This work presents mTCP, a high-performance user-level TCP stack for multicore systems. mTCP addresses the inefficiencies from the ground up--from packet I/O and TCP connection management to the application interface. In addition to adopting well-known techniques, our design (1) translates multiple expensive system calls into a single shared memory reference, (2) allows efficient flow-level event aggregation, and (3) performs batched packet I/O for high I/O efficiency. Our evaluations on an 8-core machine showed that mTCP improves the performance of small message transactions by a factor of 25 compared to the latest Linux TCP stack and a factor of 3 compared to the best-performing research system known so far. It also improves the performance of various popular applications by 33% to 320% compared to those on the Linux stack.

[1]  Jochen Liedtke,et al.  The performance of μ-kernel-based systems , 1997, SOSP.

[2]  Katerina J. Argyraki,et al.  RouteBricks: exploiting parallelism to scale software routers , 2009, SOSP '09.

[3]  Pete Loshin Transmission Control Protocol , 2003 .

[4]  Stefan Savage,et al.  Alpine: A User-Level Infrastructure for Network Protocol Development , 2001, USITS.

[5]  Anant Agarwal,et al.  Factored operating systems (fos): the case for a scalable operating system for multicores , 2009, OPSR.

[6]  Sangjin Han,et al.  PacketShader: a GPU-accelerated software router , 2010, SIGCOMM '10.

[7]  Amin Vahdat,et al.  Chronos: predictable low latency for data center applications , 2012, SoCC '12.

[8]  Michael B. Jones,et al.  Mach: a system software kernel , 1989, Digest of Papers. COMPCON Spring 89. Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage.

[9]  Michael Stumm,et al.  Exception-Less System Calls for Event-Driven Servers , 2011, USENIX Annual Technical Conference.

[10]  Michael Stumm,et al.  FlexSC: Flexible System Call Scheduling with Exception-Less System Calls , 2010, OSDI.

[11]  Vivek S. Pai,et al.  Towards understanding modern web traffic , 2011, SIGMETRICS '11.

[12]  Dawson R. Engler,et al.  Exokernel: an operating system architecture for application-level resource management , 1995, SOSP.

[13]  Byung-Gon Chun,et al.  MegaPipe: A New Programming Interface for Scalable Network I/O , 2012, OSDI.

[14]  Sally Floyd,et al.  The NewReno Modification to TCP's Fast Recovery Algorithm , 2004, RFC.

[15]  Injong Rhee,et al.  CUBIC: a new TCP-friendly high-speed TCP variant , 2008, OPSR.

[16]  Robert Tappan Morris,et al.  Improving network connection locality on multicore systems , 2012, EuroSys '12.

[17]  Yang Zhang,et al.  Corey: An Operating System for Many Cores , 2008, OSDI.

[18]  Dawson R. Engler,et al.  Fast and flexible application-level networking on exokernel systems , 2002, TOCS.

[19]  Roy T. Fielding,et al.  The Apache HTTP Server Project , 1997, IEEE Internet Comput..

[20]  Seungyeop Han,et al.  SSLShader: Cheap SSL Acceleration with Commodity Processors , 2011, NSDI.

[21]  Luigi Rizzo,et al.  netmap: A Novel Framework for Fast Packet I/O , 2012, USENIX ATC.

[22]  Adrian Schüpbach,et al.  The multikernel: a new OS architecture for scalable multicore systems , 2009, SOSP '09.

[23]  Eunyoung Jeong,et al.  Comparison of caching strategies in modern cellular backhaul networks , 2013, MobiSys '13.

[24]  Andrea C. Arpaci-Dusseau,et al.  Deploying Safe User-Level Network Services with icTCP , 2004, OSDI.

[25]  Hyeontaek Lim,et al.  MICA: A Holistic Approach to Fast In-Memory Key-Value Storage , 2014, NSDI.

[26]  Thu D. Nguyen,et al.  Implementing Network Protocols at User Level , 1993, SIGCOMM.

[27]  Tony Tung,et al.  Scaling Memcache at Facebook , 2013, NSDI.

[28]  David G. Andersen,et al.  The Case for VOS: The Vector Operating System , 2011, HotOS.

[29]  Jinyang Li,et al.  Using One-Sided RDMA Reads to Build a Fast, CPU-Efficient Key-Value Store , 2013, USENIX ATC.