High performance virtual machine migration with RDMA over modern interconnects

One of the most useful features provided by virtual machine (VM) technologies is the ability to migrate running OS instances across distinct physical nodes. As the basis for many administration tools in modern clusters and data centers, VM migration must be highly efficient to reduce both migration time and the performance impact on hosted applications. Currently, most VM environments use the socket interface and the TCP/IP protocol to transfer VM migration traffic. In this paper, we propose a high performance VM migration design that uses RDMA (Remote Direct Memory Access). RDMA is a feature provided by many modern high speed interconnects that are being widely deployed in data centers and clusters. By taking advantage of the low software overhead and the one-sided nature of RDMA, our design significantly improves the efficiency of VM migration. We also contribute a set of micro-benchmarks and application-level benchmarks for evaluating important metrics of VM migration. Evaluations of our prototype implementation over Xen and InfiniBand show that RDMA can drastically reduce migration overhead: by up to 80% in total migration time and by up to 77% in application-observed downtime.
