Anatomy of Cloud Monitoring and Metering: A case study and open problems

Microservices-based architecture has recently gained traction among cloud service providers in their quest for a more scalable and reliable modular architecture. In parallel with this architectural choice, cloud providers also face market demand for fine-grained, usage-based pricing. Both managing the complex dependencies among microservices and fine-grained metering require providers to track and log detailed monitoring data from their deployed cloud setups. Hence, on one hand, providers need to record all such performance changes and events, while on the other hand, they are concerned with the additional cost of the resources required to store and process this ever-increasing amount of collected data. In this paper, we analyze the design of the monitoring subsystem provided by open source cloud solutions such as OpenStack. Specifically, we analyze how OpenStack collects monitoring data and assess the characteristics of the data it collects, aiming to pinpoint the limitations of the current approach and suggest alternative solutions. Our preliminary evaluation of the proposed solutions reveals that it is possible to reduce the monitored data size by up to 80% and the missed anomaly detection rate from 3% to as low as 0.05% to 0.1%.
