Memory and Network Aware Scheduling of Virtual Machine Migrations

Live migration has become a common operation on virtualized infrastructures. It is widely used by resource management algorithms to distribute the load between servers and to reduce energy consumption. Operators also rely on migrations to prepare production servers for critical maintenance by relocating their running VMs elsewhere. To apply new VM placement decisions, live migrations must be scheduled by selecting, for each migration, the moment to start and the bandwidth to allocate. Long migrations violate SLAs and reduce the practical benefits of placement algorithms, so VMs should be migrated as fast as possible. To do so, the migration scheduler must predict the migration durations accurately and schedule the migrations accordingly. Dynamic VM placement algorithms focus extensively on computing a high-quality placement. Their practical reactivity is, however, lowered by restrictive assumptions that underestimate the migration durations. For example, Entropy assumes a non-blocking homogeneous network coupled with a null dirty page rate, whereas we have already shown that both the network topology and the live memory usage of the workload are dominating factors. Recently, some migration models have been developed and integrated into simulators to evaluate VM placement algorithms properly. While these models reproduce migrations finely, they are devoted to simulation only and are not used to compute scheduling decisions. We propose here a migration scheduler that considers the network topology, the migration routes, the VM memory usage and the dirty page rates to compute precise migration durations and infer better schedules. We implemented our scheduler on top of BtrPlace, an extensible version of Entropy whose scheduling capabilities can be enriched through plug-ins.
To assess the flexibility of our scheduler, we also implemented constraints to synchronize migrations, to establish precedence rules and to respect power budgets, as well as an objective that minimizes energy consumption. We evaluated the accuracy of our model and its resulting benefits by executing migration scenarios on a real testbed including a blocking network, mixed VM memory workloads and collocation settings. Our model predicted the migration durations with at least 94% accuracy and an absolute error of 1 second, whereas vanilla BtrPlace was only 30% accurate. This gain in precision led to wiser scheduling decisions. In practice, the migrations completed on average 3.5 times faster than with an execution based on vanilla BtrPlace. Thanks to better control of migrations and power-switching actions, we also reduced the power consumption of a server decommissioning scenario under different power budgets.
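To illustrate why the dirty page rate and the allocated bandwidth matter when scheduling, the following is a minimal sketch of a textbook iterative pre-copy estimate. It is not the model or the API of BtrPlace. The function names, the `max_downtime_s` and `max_rounds` parameters, the single shared link and the equal bandwidth split are all assumptions made for illustration.

```python
from typing import List, Tuple

# A VM is described by (memory in MiB, dirty page rate in MiB/s).
VM = Tuple[float, float]


def precopy_duration(mem_mib: float, dirty_mib_s: float, bw_mib_s: float,
                     max_downtime_s: float = 0.3, max_rounds: int = 30) -> float:
    """Estimate the duration of one pre-copy live migration, in seconds.

    Hypothetical simplified model: each round resends the memory dirtied
    during the previous round; the VM is paused for a final copy once the
    remaining data fits within max_downtime_s, or after max_rounds rounds.
    """
    if bw_mib_s <= 0:
        raise ValueError("bandwidth must be positive")
    to_send = float(mem_mib)
    elapsed = 0.0
    for _ in range(max_rounds):
        if to_send <= max_downtime_s * bw_mib_s:
            break  # the remainder can be sent within the allowed downtime
        round_time = to_send / bw_mib_s
        elapsed += round_time
        # Pages dirtied while this round was being transferred (capped at RAM size).
        to_send = min(dirty_mib_s * round_time, mem_mib)
    # Final stop-and-copy phase.
    return elapsed + to_send / bw_mib_s


def sequential_makespan(vms: List[VM], link_bw_mib_s: float) -> float:
    """Migrate the VMs one after another, each using the full link."""
    return sum(precopy_duration(m, d, link_bw_mib_s) for m, d in vms)


def parallel_makespan(vms: List[VM], link_bw_mib_s: float) -> float:
    """Migrate all VMs at once, assuming the link is split equally."""
    share = link_bw_mib_s / len(vms)
    return max(precopy_duration(m, d, share) for m, d in vms)


if __name__ == "__main__":
    vms = [(1000.0, 25.0), (1000.0, 25.0)]  # two identical, hypothetical VMs
    print("sequential:", sequential_makespan(vms, 100.0))
    print("parallel:  ", parallel_makespan(vms, 100.0))
```

With a null dirty page rate, the estimate collapses to memory size divided by bandwidth, which is the kind of optimistic assumption the abstract criticizes. With a non-zero dirty page rate, halving the bandwidth more than doubles the duration in this sketch, so running migrations in sequence over a shared link can finish sooner than running them in parallel.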
