Predictive Test Selection

Change-based testing is a key component of continuous integration at Facebook. However, the large number of tests, coupled with the high rate of changes committed to our monolithic repository, makes it infeasible to run all potentially impacted tests on each change. We propose a new predictive test selection strategy that selects a subset of tests to exercise for each change submitted to the continuous integration system. The strategy is learned from a large dataset of historical test outcomes using basic machine learning techniques. Deployed in production, the strategy reduces the total infrastructure cost of testing code changes by a factor of two, while guaranteeing that over 95% of individual test failures and over 99.9% of faulty changes are still reported back to developers. The method we present here also accounts for the non-determinism of test outcomes, commonly known as test flakiness.
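
To make the idea of learning a selection strategy from historical outcomes concrete, here is a minimal Python sketch. It is not the model used in production; the abstract does not specify the features or the learner. A smoothed per-(changed file, test) failure frequency stands in for the learned predictor. The record layout, the function names `train`, `failure_score`, and `select_tests`, and the `prior`, `weight`, and `threshold` values are all assumptions made for illustration. The `failed` labels are assumed to have been cleaned of flaky failures already, for example by retrying.

```python
from collections import defaultdict


def train(history):
    """Count runs and failures per (changed file, test) pair.

    history: iterable of (changed_files, test_name, failed) records
    (a hypothetical layout; the real dataset schema is not given).
    """
    runs = defaultdict(int)
    fails = defaultdict(int)
    for changed_files, test, failed in history:
        for path in changed_files:
            runs[(path, test)] += 1
            fails[(path, test)] += int(failed)
    return runs, fails


def failure_score(model, changed_files, test, prior=0.01, weight=1.0):
    """Estimate how likely `test` is to fail for this change.

    Uses the highest smoothed failure rate over the changed files and
    never drops below `prior`, so unseen pairs get the prior.
    """
    runs, fails = model
    best = prior
    for path in changed_files:
        n = runs.get((path, test), 0)
        f = fails.get((path, test), 0)
        best = max(best, (f + prior * weight) / (n + weight))
    return best


def select_tests(model, changed_files, candidate_tests, threshold=0.1):
    """Keep only the candidate tests whose score reaches the threshold."""
    return [
        t for t in candidate_tests
        if failure_score(model, changed_files, t) >= threshold
    ]


history = [
    (["a.py"], "test_a", True),
    (["a.py"], "test_a", True),
    (["a.py"], "test_b", False),
    (["a.py"], "test_b", False),
    (["b.py"], "test_b", True),
]
model = train(history)
selected = select_tests(model, ["a.py"], ["test_a", "test_b", "test_c"])
```

In this sketch the threshold is the knob that trades infrastructure cost against the fraction of failures still caught: lowering it runs more tests and misses fewer failures.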
