iTree: efficiently discovering high-coverage configurations using interaction trees

Software configurability has many benefits, but it also makes programs much harder to test because, in the worst case, the program must be tested under every possible configuration. One potential remedy to this problem is combinatorial interaction testing (CIT), in which the developer typically selects a strength t and then computes a covering array containing all t-way configuration option combinations. However, in a prior study we showed that several programs have important high-strength interactions (combinations of a subset of configuration options) that CIT is highly unlikely to generate in practice. In this paper, we propose a new algorithm called interaction tree discovery (iTree) that aims to identify sets of configurations to test that are smaller than those generated by CIT, while also including important high-strength interactions missed by practical applications of CIT. On each iteration of iTree, we first use low-strength CIT to test the program under a set of configurations, and then apply machine learning techniques to discover new interactions that are potentially responsible for any new coverage seen. By repeating this process, iTree builds up a set of configurations likely to contain key high-strength interactions. We evaluated iTree by comparing the coverage it achieves with that of covering arrays and randomly generated configuration sets. Our results strongly suggest that iTree can identify high-coverage sets of configurations more effectively than traditional CIT or random sampling.
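To make the notion of a strength-t covering array concrete, the following is a minimal sketch, not the iTree algorithm itself and not the generator used in the paper. It assumes a small configuration space that can be enumerated exhaustively and uses a naive greedy strategy; the function names (`t_way_tuples`, `covered_by`, `greedy_covering_array`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def t_way_tuples(domains, t):
    """Return every (option indices, values) combination of strength t.

    `domains` is a list of value tuples, one per configuration option.
    """
    required = set()
    for idxs in combinations(range(len(domains)), t):
        for vals in product(*(domains[i] for i in idxs)):
            required.add((idxs, vals))
    return required


def covered_by(config, t):
    """Return the strength-t combinations exercised by one configuration."""
    return {
        (idxs, tuple(config[i] for i in idxs))
        for idxs in combinations(range(len(config)), t)
    }


def greedy_covering_array(domains, t):
    """Naive greedy construction (an illustrative assumption, not the
    paper's tool): repeatedly pick the configuration that covers the most
    still-uncovered t-way combinations. Enumerates the full space, so it
    is only suitable for tiny examples."""
    uncovered = t_way_tuples(domains, t)
    all_configs = list(product(*domains))
    rows = []
    while uncovered:
        best = max(all_configs, key=lambda c: len(covered_by(c, t) & uncovered))
        rows.append(best)
        uncovered -= covered_by(best, t)
    return rows


if __name__ == "__main__":
    # Four hypothetical binary options; 2**4 = 16 configurations in total.
    domains = [(0, 1)] * 4
    rows = greedy_covering_array(domains, 2)
    print(len(rows), "configurations cover all 2-way combinations")

    # Count the 3-way combinations that the 2-way array does not exercise.
    seen = set().union(*(covered_by(r, 3) for r in rows))
    print(len(t_way_tuples(domains, 3) - seen), "3-way combinations missed")
```

Under these assumptions, the 2-way array is much smaller than the full configuration space, but it leaves some 3-way combinations untested, which is the kind of gap in high-strength interactions that the abstract describes.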
