Automated Performance Tuning for Highly-Configurable Software Systems

Performance is an important non-functional software requirement. Modern software systems are highly configurable, and misconfigurations can easily cause performance issues such as low throughput and long response times. However, the sheer size of the configuration space makes it challenging for administrators to manually select and adjust configuration options to achieve better performance. In this paper, we propose ConfRL, an approach that tunes software performance automatically. The key idea of ConfRL is to use reinforcement learning to explore the configuration space by trial and error, and to use the feedback received from the environment to adjust configuration option values toward better performance. To reduce the cost of reinforcement learning, ConfRL employs sampling, clustering, and dynamic state reduction techniques to keep the number of states in a large configuration space manageable. Our evaluation on four real-world highly configurable server programs shows that ConfRL can efficiently and effectively guide software systems to achieve higher long-term performance.
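To make the trial-and-error idea concrete, the following is a minimal sketch of tabular Q-learning over a toy configuration space. It is not ConfRL's actual algorithm: the option names (`max_clients`, `keepalive_timeout`), their candidate values, the synthetic `measure_throughput` function, and all hyperparameters are assumptions made for illustration, and the sampling, clustering, and state reduction steps mentioned in the abstract are omitted.

```python
import random

# Hypothetical options and candidate values (illustrative, not from the paper).
OPTIONS = {
    "max_clients": [50, 100, 150, 200],
    "keepalive_timeout": [1, 5, 10, 15],
}
NAMES = list(OPTIONS)

# An action raises or lowers one option by one step; None leaves the config unchanged.
ACTIONS = [(i, d) for i in range(len(NAMES)) for d in (-1, 1)] + [None]


def measure_throughput(state):
    """Synthetic stand-in for a real benchmark run (assumed, peaks at 150 / 5)."""
    clients = OPTIONS["max_clients"][state[0]]
    timeout = OPTIONS["keepalive_timeout"][state[1]]
    return 1000 - 2 * abs(clients - 150) - 20 * abs(timeout - 5)


def step(state, action):
    """Apply an action to a state (a tuple of value indices), clamping at the ends."""
    if action is None:
        return state
    i, delta = action
    idx = min(max(state[i] + delta, 0), len(OPTIONS[NAMES[i]]) - 1)
    return state[:i] + (idx,) + state[i + 1:]


def q_learning(episodes=1000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q-values by epsilon-greedy trial and error from random start configs."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = tuple(rng.randrange(len(OPTIONS[n])) for n in NAMES)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda k: q.get((state, k), 0.0))
            nxt = step(state, ACTIONS[a])
            # Feedback from the environment: change in measured throughput.
            reward = measure_throughput(nxt) - measure_throughput(state)
            best_next = max(q.get((nxt, k), 0.0) for k in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def tune(q, state, steps=10):
    """Follow the learned greedy policy and return the resulting configuration."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda k: q.get((state, k), 0.0))
        state = step(state, ACTIONS[a])
    return {n: OPTIONS[n][i] for n, i in zip(NAMES, state)}
```

Calling `tune(q_learning(), (0, 0))` starts from the lowest value of each option and follows the learned policy. In a real system, `measure_throughput` would be replaced by an actual benchmark run, which is what makes each trial expensive and is the cost the abstract's state-reduction techniques aim to contain.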
