An Exploratory Study on Assessing the Impact of Environment Variations on the Results of Load Tests

Large-scale software systems like Amazon and healthcare.gov are used by thousands or millions of people every day. To ensure the quality of these systems, load testing is a required testing procedure in addition to conventional functional testing techniques like unit and system integration testing. One of the important requirements of load testing is a field-like test environment. Unfortunately, creating such an environment is often very challenging, for reasons such as security constraints and rapid field updates. In this paper, we conduct an exploratory study on the impact of environment variations on the results of load tests. We ran over 110 hours of load tests, which examine the system's behavior under load with various changes (e.g., installing an antivirus program) to the targeted deployment environment. We call such load tests environment-variation-based load tests. Case studies on three open source systems show that these environment changes have a clear impact on the system's performance, and that different scenarios react differently to changes in the underlying computing resources. When predicting the performance of the system under environment changes that have not previously been load tested, our ensemble models outperform the baseline models by 24% to 94%.
