Scalable correlation-aware virtual machine consolidation using two-phase clustering

Server consolidation is the most common and effective method to save energy and increase resource utilization in data centers, and virtual machine (VM) placement is the usual way of achieving it. VM placement is challenging, however, given the scale of today's IT infrastructures and the risk of resource contention among co-located VMs after consolidation. The correlation among VMs to be co-located therefore needs to be considered. Existing solutions, however, do not address the scalability issue that arises once the number of VMs grows to an order of magnitude at which calculating the correlation between each pair of VMs becomes unrealistic. In this paper, we propose ScalCCon, a correlation-aware VM consolidation solution that uses a novel two-phase clustering scheme to address this scalability problem. We demonstrate the benefits of the two-phase clustering scheme in comparison to solutions using one-phase clustering (up to 84% reduction in execution time when 17,446 VMs are considered). Moreover, our solution reduces the number of physical machines (PMs) required, as well as the number of performance violations, compared to existing correlation-based approaches.
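To give a sense of why per-pair correlation becomes unrealistic at this scale, the following minimal sketch counts the VM pairs involved. It is only an arithmetic illustration, not the paper's algorithm: the number of groups `k` and the even split into groups are hypothetical assumptions, and the abstract does not say how the two clustering phases are actually carried out.

```python
def pair_count(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def clustered_pair_count(cluster_sizes: list[int]) -> int:
    """Pairs that remain if correlation is computed only inside each group."""
    return sum(pair_count(size) for size in cluster_sizes)


n_vms = 17446  # number of VMs quoted in the abstract
k = 100        # hypothetical number of coarse groups (assumption, not from the paper)

# Assumed even split of the VMs into k groups, purely for illustration.
base, extra = divmod(n_vms, k)
sizes = [base + 1] * extra + [base] * (k - extra)

all_pairs = pair_count(n_vms)
within_group_pairs = clustered_pair_count(sizes)

print(f"all pairs:          {all_pairs:,}")
print(f"within-group pairs: {within_group_pairs:,}")
print(f"ratio:              {all_pairs / within_group_pairs:.1f}x")
```

Under these assumptions, restricting correlation to pairs inside a group cuts the number of coefficients to compute by roughly the number of groups, which is the kind of saving a coarse first clustering pass can make available to a second, correlation-based pass.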
