NetKernel: Making Network Stack Part of the Virtualized Infrastructure

This paper presents NetKernel, a system that decouples the network stack from the guest virtual machine and offers it as an independent module. NetKernel represents a new paradigm in which the network stack can be managed as part of the virtualized infrastructure. It provides important efficiency benefits: by gaining control over and visibility into the network stack, operators can perform network management more directly and flexibly, for example by multiplexing VMs running different applications onto the same network stack module to save CPU cores, and by enforcing fair bandwidth sharing with distributed congestion control. Users also benefit from simplified stack deployment and better performance. For example, mTCP can be deployed without API changes to support nginx natively, and shared memory networking can be readily enabled to improve the performance of colocated VMs. Testbed evaluation using 100G NICs shows that NetKernel preserves the performance and scalability of both kernel and userspace network stacks, and provides the same isolation as the current architecture.
