Exploratory Performance Testing Using Reinforcement Learning

Performance bottlenecks that cause high response times and low throughput in software systems can damage the reputation of the companies that rely on them. Almost two-thirds of performance bottlenecks are triggered by specific input values. However, finding input values for performance test cases that expose bottlenecks in a large-scale, complex system within a reasonable amount of time is cumbersome, costly, and time-consuming, because the number of input-value combinations to explore is large and the time available is limited. This paper presents PerfXRL, a novel approach for finding the combinations of input values that reveal performance bottlenecks in the system under test. Our approach uses reinforcement learning to explore a large input space of input-value combinations and to learn to focus on the areas of that space that trigger performance bottlenecks. The experimental results show that PerfXRL can detect 72% more performance bottlenecks than random testing while exploring only 25% of the input space.
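The idea of learning where in the input space to look can be illustrated with a small sketch. The code below is not PerfXRL itself: it uses plain tabular Q-learning on a toy problem, and the simulated `response_time` function, the 10 x 10 input grid, the 500 ms threshold, and all hyperparameters are assumptions made only for illustration.

```python
import random

# Hypothetical system under test: response time (ms) for two integer inputs.
# The slow region (x >= 7 and y <= 2) stands in for a performance bottleneck.
def response_time(x, y):
    return 900.0 if (x >= 7 and y <= 2) else 100.0

SIZE = 10                # each input takes values 0..9 (assumed)
THRESHOLD_MS = 500.0     # assumed threshold for calling a response "slow"
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # change one input by one step

def step(state, action):
    """Apply an action to an input combination, clamping to the valid range."""
    x = min(max(state[0] + action[0], 0), SIZE - 1)
    y = min(max(state[1] + action[1], 0), SIZE - 1)
    return (x, y)

def explore(episodes=200, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}         # (state, action index) -> estimated value
    found = set()  # distinct input combinations flagged as bottlenecks
    for _ in range(episodes):
        state = (rng.randrange(SIZE), rng.randrange(SIZE))
        for _ in range(steps):
            # Epsilon-greedy: mostly follow the best-known action,
            # occasionally try a random one.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)),
                        key=lambda i: q.get((state, i), 0.0))
            nxt = step(state, ACTIONS[a])
            slow = response_time(*nxt) > THRESHOLD_MS
            reward = 1.0 if slow else 0.0  # reward inputs that run slowly
            if slow:
                found.add(nxt)
            # Standard Q-learning update.
            best_next = max(q.get((nxt, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q, found

q_table, bottlenecks = explore()
print(sorted(bottlenecks))
```

Because slow responses are rewarded, the learned values steer later episodes toward the slow region instead of sampling the grid uniformly, which is the behavior the abstract contrasts with random testing.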
