PerfCI: A Toolchain for Automated Performance Testing during Continuous Integration of Python Projects

Software performance testing is an essential quality assurance mechanism that can identify optimization opportunities. Automating this process requires strong tool support, especially under Continuous Integration (CI), where tests need to run fully automatically and developers should receive actionable feedback. Because such tools are largely missing, performance testing is normally left out of the scope of CI. In this paper, we propose a toolchain, PerfCI, that paves the way for developers to easily set up and carry out automated performance testing under CI. Our toolchain allows users to (1) specify performance testing tasks, (2) analyze unit tests on a variety of Python projects, ranging from scripts to full-blown Flask-based web services, by extending a performance analysis framework (VyPR), and (3) evaluate the collected performance data to get feedback on the code. We demonstrate the feasibility of our toolchain by applying it to a web service running at the Compact Muon Solenoid (CMS) experiment at CERN, the world's largest particle physics laboratory. Package: Source code, examples, and documentation of PerfCI are available at https://gitlab.cern.ch/omjaved/PerfCI. A tool demonstration can be viewed on YouTube: https://youtu.be/RDmXMKAlv7g. We also provide the data set used in the analysis: https://gitlab.cern.ch/omjaved/PerfCI-dataset.
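To make the kind of performance testing task that PerfCI automates more concrete, the sketch below shows a plain pytest-style unit test that times a function and fails the CI job when an agreed latency budget is exceeded. The function process_records and the 0.5 s budget are hypothetical placeholders for illustration only; PerfCI's actual specification syntax and its VyPR-based instrumentation are documented in the linked repository.

    # Illustrative sketch of a performance testing task run under CI.
    # process_records and the 0.5 s budget are hypothetical placeholders.
    import time

    def process_records(records):
        # Stand-in for production code whose latency we want to track.
        return sorted(records)

    def test_process_records_performance():
        records = list(range(100_000, 0, -1))

        start = time.perf_counter()
        process_records(records)
        elapsed = time.perf_counter() - start

        # Fail the CI job if the observed run time exceeds the agreed
        # budget for this unit of code.
        assert elapsed < 0.5, f"process_records took {elapsed:.3f}s (budget: 0.5s)"

A toolchain such as PerfCI takes over the measurement and evaluation steps of such checks, so that developers only specify the task and receive feedback from the CI run.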
