Experiences Implementing Live VM Migration over the WAN with Multi-Path TCP

Live VM migration allows a running virtual machine or service to be moved from one host to another without being shut down. This capability offers many benefits for Internet services, including load balancing and continued service availability, especially during host maintenance. Nevertheless, VM migration has been limited to layer 2 environments, where the VM’s IP address can migrate with the VM, because the IP address must remain reachable after the migration. This restricts where VMs can be migrated, limiting their potential and utility.

In this paper, we show how a new Internet standard, Multi-Path TCP (MPTCP), can be used to seamlessly migrate live VMs across WAN boundaries. This allows services to migrate closer to their clients while preserving active TCP connections, improving performance, responsiveness, and user engagement. We show this by designing and implementing LSM-MPTCP, a system for VM migration over the WAN. We demonstrate how our approach can improve throughput and latency in a real cloud environment, achieving throughput improvements of up to 6 times and reducing round-trip times by 99%. We also expose subtle networking issues related to migration that can heavily affect loss rates.
