Detecting similarities in virtual machine behavior for cloud monitoring using smoothed histograms

The growing size and complexity of cloud systems raise scalability issues for resource monitoring and management. While most existing solutions treat each Virtual Machine (VM) as a black box with independent characteristics, we adopt a new perspective in which VMs with similar resource usage behavior are clustered together. We argue that this approach has the potential to address scalability issues in cloud monitoring and management. In this paper, we propose a technique that clusters VMs based on the usage of multiple resources, assuming no knowledge of the services running on them. The technique models VM behavior through the probability histograms of their resource usage, applies smoothing-based noise reduction, and selects the most relevant information to consider in the clustering process. Through extensive evaluation, we show that our proposal achieves high and stable performance in automatic VM clustering and can reduce the monitoring requirements of cloud systems.
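The idea of comparing VMs through smoothed probability histograms can be illustrated with a minimal sketch. It is not the paper's implementation: the bin count, the moving-average smoothing, the choice of the Bhattacharyya distance as the similarity measure, and the synthetic utilization samples are all assumptions made for illustration, and only a single resource is considered rather than several.

```python
import math
import random


def probability_histogram(samples, bins=20, lo=0.0, hi=1.0):
    """Normalized histogram of utilization samples (assumes a non-empty list)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        # Clamp the bin index so values at or beyond the edges stay in range.
        idx = max(0, min(int((x - lo) / width), bins - 1))
        counts[idx] += 1
    return [c / len(samples) for c in counts]


def smooth(hist, window=3):
    """Centered moving average (an assumed smoothing choice), renormalized to sum to 1."""
    half = window // 2
    out = []
    for i in range(len(hist)):
        start, stop = max(0, i - half), min(len(hist), i + half + 1)
        out.append(sum(hist[start:stop]) / (stop - start))
    total = sum(out)
    return [v / total for v in out]


def bhattacharyya_distance(p, q):
    """Distance between two discrete distributions; infinite if they do not overlap."""
    bc = min(sum(math.sqrt(a * b) for a, b in zip(p, q)), 1.0)
    return -math.log(bc) if bc > 0 else math.inf


def synthetic_usage(rng, mean, n=500):
    """Hypothetical CPU utilization samples in [0, 1] for one VM."""
    return [min(1.0, max(0.0, rng.gauss(mean, 0.05))) for _ in range(n)]


rng = random.Random(0)
web_a = smooth(probability_histogram(synthetic_usage(rng, 0.3)))
web_b = smooth(probability_histogram(synthetic_usage(rng, 0.3)))
db = smooth(probability_histogram(synthetic_usage(rng, 0.7)))

d_same = bhattacharyya_distance(web_a, web_b)  # VMs with similar behavior
d_diff = bhattacharyya_distance(web_a, db)     # VMs with different behavior
print(d_same, d_diff)
```

In this sketch the two VMs with similar load end up much closer to each other than to the third. A pairwise distance matrix built this way could then be passed to a clustering algorithm, with one histogram per monitored resource.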
