Statistical detection of QoS violations based on CUSUM control charts

Software systems today operate in highly dynamic contexts and must therefore adapt their behavior in response to changes in their context and/or requirements. Existing approaches trigger adaptation after detecting violations of quality of service (QoS) requirements simply by comparing observed QoS values to predefined thresholds, without any statistical confidence. Such threshold-based approaches may perform unnecessary adaptations, which can lead to severe shortcomings such as follow-up failures or increased costs. In this paper we introduce AuDeQAV (Automated Detection of QoS Attributes Violations), a statistical approach based on CUSUM control charts. The approach estimates the current status of the running system at runtime, monitors its QoS attributes, and provides early detection of violations of its requirements with a defined level of confidence. This enables timely intervention, preventing undesired consequences of the violation or of inappropriate remediation. We validated our approach in a series of experiments using response time datasets from real-world web services.
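
To make the contrast with plain threshold checks concrete, the following is a minimal sketch of a one-sided (upper) tabular CUSUM applied to response times. It is the textbook CUSUM scheme, not the AuDeQAV procedure itself, whose details are not given in this abstract. The function name `cusum_upper`, the parameter defaults `k=0.5` and `h=5.0` (in units of the standard deviation), and the sample data are all illustrative assumptions.

```python
from statistics import mean, stdev


def cusum_upper(samples, target, sigma, k=0.5, h=5.0):
    """Return the index of the first sample at which the upper CUSUM
    statistic exceeds h * sigma, or None if it never does.

    k and h are the reference value and decision interval, both expressed
    as multiples of sigma (assumed defaults, not taken from the paper).
    """
    c_plus = 0.0
    for i, x in enumerate(samples):
        # Accumulate deviations above target + k * sigma; never drop below zero.
        c_plus = max(0.0, c_plus + (x - target - k * sigma))
        if c_plus > h * sigma:
            return i
    return None


# Hypothetical response times in milliseconds (not from the paper's datasets).
baseline = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
target = mean(baseline)   # in-control mean estimated from the baseline
sigma = stdev(baseline)   # in-control standard deviation

# The same baseline followed by a small sustained upward shift.
observed = baseline + [104, 105, 106, 104, 105, 106, 105, 104]

alarm_at = cusum_upper(observed, target, sigma)
no_alarm = cusum_upper(baseline, target, sigma)
print(alarm_at, no_alarm)
```

In this sketch the in-control baseline raises no alarm, while the sustained shift is flagged a few samples after it begins, because the statistic accumulates evidence across observations rather than reacting to any single value. The choice of `k` and `h` governs the trade-off between false alarms and detection delay.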
