Machine Learning in Virtualization: Estimate a Virtual Machine's Working Set Size

Achieving a high density of virtual machines on a node while maintaining their performance depends strongly on correctly calculating each virtual machine's working set. Different strategies have been applied to this problem. Some researchers treat a virtual machine as an unpredictable memory consumer, while others try to introspect the guest OS's knowledge of memory pressure. This paper introduces a new approach to calculating the working set size: regression analysis. The technique estimates memory consumption from a set of virtualization events. We discuss the correlation between the working set size and virtualization events, demonstrate the applicability of the approach, and state its limitations. We justify the choice of mathematical tools and describe in detail how the training and control samples were collected. The final evaluation demonstrates a significant resource gain.