Adaptive random testing: an illusion of effectiveness?

Adaptive Random Testing (ART) has been proposed as an enhancement to random testing, based on assumptions about how failing test cases are distributed in the input domain. The main assumption is that failing test cases are usually grouped into contiguous regions. Many papers have described ART as an effective alternative to random testing when effectiveness is measured by the average number of test case executions needed to find a failure (F-measure). However, all the work in the literature is based either on simulations or on case studies with unreasonably high failure rates. In this paper, we report on the largest empirical analysis of ART in the literature, in which 3727 mutated programs and nearly ten trillion test cases were used. Results show that ART is highly inefficient even on trivial problems when accounting for distance calculations among test cases, to an extent that probably prevents its practical use in most situations. For example, on the infamous Triangle Classification program, random testing finds failures in a few milliseconds whereas ART execution time is prohibitive. Even when assuming a small, fixed-size test set and looking at the probability of failure (P-measure), ART fares only slightly better than random testing, which is not sufficient to make it applicable in realistic conditions. We provide precise explanations of this phenomenon based on rigorous empirical analyses. For the simpler case of single-dimension input domains, we also perform formal analyses to support our claim that ART is of little use in most situations, unless drastic enhancements are developed. Such analyses help us explain some of the empirical results and identify the components of ART that need to be improved to make it a viable option in practice.
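The trade-off described in the abstract, fewer test executions versus the cost of distance calculations, can be illustrated with a minimal sketch. It is not the paper's experimental setup. It assumes the fixed-size-candidate-set variant of ART, a one-dimensional input domain [0, 1), and a single contiguous failure region; the function names, the candidate set size k = 10, and the failure region are hypothetical choices made only for illustration.

```python
import random


def random_testing(is_failure, rng, max_tests=2000):
    """Plain random testing on [0, 1).

    Returns (tests_executed, distance_calculations). The first value is one
    sample of the F-measure; the second is always 0 for random testing.
    """
    for n in range(1, max_tests + 1):
        if is_failure(rng.random()):
            return n, 0
    return max_tests, 0


def fscs_art(is_failure, rng, k=10, max_tests=2000):
    """Fixed-size-candidate-set ART on [0, 1) (assumed variant).

    For each new test, draw k random candidates and execute the one whose
    nearest already-executed test is farthest away. Every candidate is
    compared with every executed test, so the number of distance
    calculations grows quadratically with the number of tests.
    """
    executed = []
    distance_calcs = 0
    for n in range(1, max_tests + 1):
        if not executed:
            chosen = rng.random()  # first test is purely random
        else:
            chosen, best_distance = None, -1.0
            for _ in range(k):
                candidate = rng.random()
                nearest = min(abs(candidate - e) for e in executed)
                distance_calcs += len(executed)
                if nearest > best_distance:
                    chosen, best_distance = candidate, nearest
        if is_failure(chosen):
            return n, distance_calcs
        executed.append(chosen)
    return max_tests, distance_calcs


def in_failure_region(x):
    """Hypothetical contiguous failure region with failure rate 0.01."""
    return 0.40 <= x < 0.41


if __name__ == "__main__":
    rng = random.Random(0)
    runs = 200
    rt = [random_testing(in_failure_region, rng) for _ in range(runs)]
    art = [fscs_art(in_failure_region, rng) for _ in range(runs)]
    print("RT  mean tests to failure:", sum(n for n, _ in rt) / runs)
    print("ART mean tests to failure:", sum(n for n, _ in art) / runs)
    print("ART mean distance calcs:  ", sum(d for _, d in art) / runs)
```

In this sketch, a run that executes n tests performs k * n * (n - 1) / 2 distance calculations. This quadratic overhead is the kind of cost the abstract refers to when it accounts for distance calculations among test cases.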
