DeltaSherlock: Identifying changes in the cloud

To meet security and compliance requirements and to diagnose problems, administrators of cloud computing systems need to monitor significant system changes on the cloud instances they supervise. Given the large number of instances (virtual machines, containers), possibly operating under multiple configurations, these changes are difficult to track. Standard solutions rely on manually created rules to identify changes. These techniques are limited in scope, depend on domain expertise, and are time-consuming and error-prone. More streamlined approaches that automatically determine the type of individual system changes have recently been proposed, but they assume that the system states immediately before and after each individual change can be captured, a requirement that is difficult to enforce in real-world usage. This paper proposes DeltaSherlock, a practical system change discovery framework that can capture system states on demand and detect multiple system changes between them. We evaluate DeltaSherlock over 25,000 system changes caused by software installations, collected from virtual machines (VMs) deployed on a commercial cloud. DeltaSherlock identifies multiple software installations with 96.8% accuracy when supplied with a non-overlapping record of system changes, and with 77.8% accuracy when supplied with random, irregular observations that may contain overlapping or incomplete changes.
