Supervised Learning for Test Suit Selection in Continuous Integration

Continuous Integration is the process of merging code changes into a software project. Keeping the master branch up to date and passing at all times is computationally expensive because of the number of tests and the amount of code that must be executed. The resulting waiting times also increase the time required for debugging. This paper proposes a way to reduce the execution time of the testing phase by selecting only a subset of the tests for a given set of code changes. This is accomplished by training a Machine Learning (ML) classifier on features such as the failure history of code and test files, the code file extensions that tend to generate more errors during the testing phase, and others. The best ML classifier achieved results comparable with recent literature in the same area. This model reduced the median test execution time by nearly 10 minutes while maintaining 97% recall. Additionally, the impact of innocent commits and flaky tests was taken into account and studied to understand a particular industrial context.
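
The two quantities reported above, recall and saved execution time, can be illustrated with a minimal sketch. It assumes that recall means the fraction of failing tests that the selection still runs, and that saved time is the total duration of the skipped tests; the paper's exact definitions may differ. The names `RunRecord`, `selection_recall`, and `time_saved_s` are hypothetical, and the sample values are toy numbers, not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """One test in one CI run (hypothetical structure for illustration)."""
    name: str
    duration_s: float  # how long the test takes to execute
    failed: bool       # whether the test fails on this change
    selected: bool     # whether the classifier chose to run it


def selection_recall(runs):
    """Fraction of failing tests that were selected (assumed definition)."""
    failing = [r for r in runs if r.failed]
    if not failing:
        return 1.0  # nothing to catch, so nothing was missed
    caught = sum(1 for r in failing if r.selected)
    return caught / len(failing)


def time_saved_s(runs):
    """Total duration of the tests that were skipped."""
    return sum(r.duration_s for r in runs if not r.selected)


# Toy values for illustration only; not taken from the paper.
runs = [
    RunRecord("test_login", 120.0, failed=True, selected=True),
    RunRecord("test_checkout", 300.0, failed=False, selected=False),
    RunRecord("test_search", 60.0, failed=True, selected=True),
    RunRecord("test_profile", 90.0, failed=False, selected=True),
    RunRecord("test_export", 240.0, failed=True, selected=False),
]

recall = selection_recall(runs)
saved = time_saved_s(runs)
print(f"recall={recall:.2f}, time saved={saved:.0f}s")
```

In this toy run the selection skips one failing test, which shows the trade-off the abstract describes: skipping more tests saves more time but risks lowering recall.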
