Network stack specialization for performance

Contemporary network stacks are masterpieces of generality, supporting a range of edge-node and middle-node functions. This generality comes at significant performance cost: current APIs, memory models, and implementations drastically limit the effectiveness of increasingly powerful hardware. Generality has historically been required to allow individual systems to perform many functions. However, as providers have scaled up services to support hundreds of millions of users, they have transitioned toward many thousands (or even millions) of dedicated servers performing narrow ranges of functions. We argue that the overhead of generality is now a key obstacle to effective scaling, making specialization not only viable, but necessary. This paper presents Sandstorm, a clean-slate userspace network stack that exploits knowledge of web server semantics, improving throughput over current off-the-shelf designs while retaining use of conventional operating-system and programming frameworks. Based on Netmap, our novel approach merges application and network-stack memory models, aggressively amortizes stack-internal TCP costs based on application-layer knowledge, tightly couples with the NIC event model, and exploits low-latency hardware access. We compare our approach to the FreeBSD and Linux network stacks with nginx as the web server, demonstrating ~3.5x throughput improvement, while experiencing low CPU utilization, linear scaling on multicore systems, and saturating current NIC hardware.

[1]  Carlisle M. Adams,et al.  X.509 Internet Public Key Infrastructure Online Certificate Status Protocol - OCSP , 1999, RFC.

[2]  Keir Fraser,et al.  Arsenic: a user-accessible gigabit Ethernet interface , 2001, Proceedings IEEE INFOCOM 2001. Conference on Computer Communications. Twentieth Annual Joint Conference of the IEEE Computer and Communications Society (Cat. No.01CH37213).

[3]  Khaled Elmeleegy,et al.  Overclocking the Yahoo!: CDN for faster web page loads , 2011, IMC '11.

[4]  Peter Druschel,et al.  A Scalable and Explicit Event Delivery Mechanism for UNIX , 1999, USENIX Annual Technical Conference, General Track.

[5]  Robert Tappan Morris,et al.  An Analysis of Linux Scalability to Many Cores , 2010, OSDI.

[6]  Bryan Cantrill,et al.  Dynamic Instrumentation of Production Systems , 2004, USENIX Annual Technical Conference, General Track.

[7]  Jon Crowcroft,et al.  Unikernels: library operating systems for the cloud , 2013, ASPLOS '13.

[8]  William J. Bolosky,et al.  Mach: A New Kernel Foundation for UNIX Development , 1986, USENIX Summer.

[9]  Chuck Lever,et al.  An analysis of the TUX web server , 2000 .

[10]  Mark Handley,et al.  Towards high performance virtual routers on commodity hardware , 2008, CoNEXT '08.

[11]  Moshe Bar Kernel Korner: kHTTPd, a Kernel-Based Web Server , 2000 .

[12]  L. Deri Improving Passive Packet Capture : Beyond Device Polling , 2003 .

[13]  Jonathan Lemon Kqueue - A Generic and Scalable Event Notification Facility , 2001, USENIX Annual Technical Conference, FREENIX Track.

[14]  Willy Zwaenepoel,et al.  IO-Lite: a unified I/O buffering and caching system , 1999, TOCS.

[15]  Robert Tappan Morris,et al.  Improving network connection locality on multicore systems , 2012, EuroSys '12.

[16]  Eunyoung Jeong,et al.  mTCP: a Highly Scalable User-level TCP Stack for Multicore Systems , 2014, NSDI.

[17]  Thorsten von Eicken,et al.  U-Net: a user-level network interface for parallel and distributed computing , 1995, SOSP.

[18]  Larry L. Peterson,et al.  Fbufs: a high-bandwidth cross-domain transfer facility , 1994, SOSP '93.

[19]  Chuck Lever,et al.  Scalable Network I/O in Linux , 2000, USENIX Annual Technical Conference, FREENIX Track.

[20]  Dawson R. Engler,et al.  Exokernel: an operating system architecture for application-level resource management , 1995, SOSP.

[21]  Luigi Rizzo,et al.  netmap: A Novel Framework for Fast Packet I/O , 2012, USENIX ATC.

[22]  Thu D. Nguyen,et al.  Implementing Network Protocols at User Level , 1993, SIGCOMM.

[23]  George Neville-Neil,et al.  The Design and Implementation of the FreeBSD Operating System , 2014 .

[24]  Byung-Gon Chun,et al.  Usenix Association 10th Usenix Symposium on Operating Systems Design and Implementation (osdi '12) 135 Megapipe: a New Programming Interface for Scalable Network I/o , 2022 .