Cloud Application Performance Monitoring

In this paper, we present the design and implementation of a new approach to application performance monitoring (APM) for cloud platform-as-a-service (PaaS) systems. Our approach integrates full-stack performance monitoring and analysis into the PaaS system itself, enabling comprehensive introspection. To do so, we employ lightweight, intelligent sensors and agents together with “pluggable” data analysis modules that support service level objectives (SLOs) for application response time, application-specific performance anomaly detection and root cause analysis, and workload change point detection. We implement our APM by combining the popular Elastic Stack with other common PaaS services in a portable way, so that it can be integrated easily into public and private PaaS systems.
