NetVM: High Performance and Flexible Networking Using Virtualization on Commodity Platforms

NetVM brings virtualization to the network by enabling high-bandwidth network functions to operate at near line speed, while taking advantage of the flexibility and customization of low-cost commodity servers. NetVM allows customizable data plane processing capabilities such as firewalls, proxies, and routers to be embedded within virtual machines, complementing the control plane capabilities of Software Defined Networking. NetVM makes it easy to dynamically scale, deploy, and reprogram network functions. This provides far greater flexibility than existing purpose-built, sometimes proprietary hardware, while still allowing complex policies and full packet inspection to determine subsequent processing. It does so with dramatically higher throughput than existing software router platforms. NetVM is built on top of the KVM platform and the Intel DPDK library. We detail many of the challenges we have solved, such as adding support for high-speed inter-VM communication through shared huge pages and enhancing the CPU scheduler to prevent overheads caused by inter-core communication and context switching. NetVM allows true zero-copy delivery of data to VMs, both for packet processing and for messaging among VMs within a trust boundary. Our evaluation shows how NetVM can compose complex network functionality from multiple pipelined VMs and still obtain throughputs of up to 10 Gbps, an improvement of more than 250% compared to existing techniques that use SR-IOV for virtualized networking.
