Wait analysis of virtual machines using host kernel tracing

Understanding the behavior of virtual machines (VMs) and its evolution over the VM life cycle, without installing an agent in the guest, is essential for IaaS providers. It allows the provider to better scale VM resources by properly allocating the physical resources. Because of privacy, security, ease-of-deployment, and execution-overhead concerns, the method presented here limits its data collection to the physical host level, without internal access to the VMs. We propose a host-based, precise method to recover wait states for the virtual CPUs (vCPUs) of a given VM. The Wait Analysis Algorithm (W2A) computes the state of vCPUs from the host kernel trace. The vCPU states are displayed in an interactive trace viewer (TraceCompass) for further inspection. The proposed VM trace analysis algorithm has been open-sourced for further enhancement and for the benefit of other developers. The technique is evaluated with representative workloads generated by different benchmarking tools. The approach is based on host hypervisor tracing, which incurs a lower overhead (around 0.03%) than other approaches.
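As an illustration of the general idea of recovering vCPU states from a host trace, the sketch below walks a list of host events and builds state intervals for one vCPU thread. It is not the W2A algorithm itself, whose details are not given here. The event names `kvm_entry`, `kvm_exit`, and `sched_switch` are Linux kernel tracepoints; the `Event` record, the state labels, and the rule that a non-zero `prev_state` means the thread blocked are simplifying assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified host trace record (not a real tracer API)."""
    ts: int              # timestamp, arbitrary units
    name: str            # "kvm_entry", "kvm_exit" or "sched_switch"
    tid: int = 0         # thread that emitted kvm_entry / kvm_exit
    prev_tid: int = 0    # sched_switch: thread leaving the CPU
    next_tid: int = 0    # sched_switch: thread entering the CPU
    prev_state: int = 0  # sched_switch: 0 is assumed to mean "still runnable"


Interval = Tuple[int, int, str]


def vcpu_intervals(events: Iterable[Event], vcpu_tid: int) -> List[Interval]:
    """Return closed (start, end, state) intervals for one vCPU thread."""
    intervals: List[Interval] = []
    state: Optional[str] = None
    start = 0
    for ev in sorted(events, key=lambda e: e.ts):
        new_state: Optional[str] = None
        if ev.name == "kvm_entry" and ev.tid == vcpu_tid:
            new_state = "GUEST"        # vCPU runs guest code
        elif ev.name == "kvm_exit" and ev.tid == vcpu_tid:
            new_state = "HYPERVISOR"   # host handles the exit
        elif ev.name == "sched_switch":
            if ev.prev_tid == vcpu_tid:
                # Assumption: runnable when descheduled means preempted,
                # anything else means the vCPU thread is waiting.
                new_state = "PREEMPTED" if ev.prev_state == 0 else "WAIT"
            elif ev.next_tid == vcpu_tid:
                new_state = "HYPERVISOR"
        if new_state is None or new_state == state:
            continue
        if state is not None:
            intervals.append((start, ev.ts, state))
        state, start = new_state, ev.ts
    return intervals  # the last, still-open state is not emitted


def total_time(intervals: Iterable[Interval], state: str) -> int:
    """Sum the duration of all intervals in the given state."""
    return sum(end - begin for begin, end, s in intervals if s == state)


demo_trace = [
    Event(ts=0, name="sched_switch", next_tid=42),
    Event(ts=10, name="kvm_entry", tid=42),
    Event(ts=50, name="kvm_exit", tid=42),
    Event(ts=60, name="sched_switch", prev_tid=42, prev_state=1),
    Event(ts=200, name="sched_switch", next_tid=42),
    Event(ts=210, name="kvm_entry", tid=42),
]
demo_intervals = vcpu_intervals(demo_trace, vcpu_tid=42)
```

Calling `total_time(demo_intervals, "WAIT")` on the demo trace gives the time the vCPU thread spent blocked on the host. A real analysis would also need to determine why the vCPU was waiting, which this sketch does not attempt.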
