Optimizing overlay-based virtual networking through optimistic interrupts and cut-through forwarding

Overlay-based virtual networking provides a powerful model for realizing virtual distributed and parallel computing systems with strong isolation, portability, and recoverability properties. However, on extremely high-throughput, low-latency networks, such overlays can suffer from bandwidth and latency limitations, which is of particular concern when applying the model in HPC environments. Through careful study of an existing high-performance overlay-based virtual network system, we have identified two core issues limiting performance: delayed and/or excessive virtual interrupt delivery into guests, and copies between host and guest data buffers made during encapsulation. We respond with two novel optimizations: optimistic, timer-free virtual interrupt injection, and zero-copy cut-through data forwarding. These optimizations improve the latency and bandwidth of the overlay network on 10 Gbps interconnects, resulting in near-native performance for a wide range of microbenchmarks and MPI application benchmarks.
