How to Effectively Reduce Tens of Millions of Tests: An Industrial Case Study on Adaptive Random Testing

Running and analyzing a large number of tests in an industrial setting is labor intensive and time consuming. It is therefore necessary to select a smaller set of tests that reduces cost while preserving fault-detection capability. For a class of nonnumeric systems, the linear-order algorithm for adaptive random testing (LART) has been proposed to spread tests evenly over nonnumeric input domains. To further enhance LART in industrial scenarios where the number of input categories is too large, this article proposes a new technique called category selection-based ART (CSBART), in which only a subset of the categories is selected to calculate the distances between tests that guide LART. The fault-coverage effectiveness of CSBART is evaluated in an empirical study on two large-scale billing systems with tens of millions of test cases, and the results demonstrate its promising performance. We also find that, after category selection, CSBART can to a certain extent outperform a more complex and widely used n-per-cluster sampling technique that uses K-means clustering.
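
The idea of computing test distances over only a subset of categories can be illustrated with a minimal sketch. It is not the LART or CSBART algorithm evaluated in the article: it assumes that each test is a tuple with one choice per category, that distance is the number of selected categories on which two tests differ, and that selection follows a simple candidate-set max-min rule. The names `distance`, `select_tests`, the parameter `k`, and the billing-style categories are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Test = Tuple[str, ...]  # assumption: one choice per category, in a fixed order


def distance(a: Test, b: Test, selected: Sequence[int]) -> int:
    """Count the selected categories on which two tests pick different choices."""
    return sum(1 for i in selected if a[i] != b[i])


def select_tests(pool: Sequence[Test], selected: Sequence[int],
                 n: int, k: int = 10, seed: int = 0) -> List[Test]:
    """Pick up to n tests from pool, spreading them over the selected categories.

    Illustrative candidate-set rule (not the linear-order algorithm): at each
    step, sample up to k remaining tests and keep the one whose nearest
    already-chosen test is farthest away.
    """
    rng = random.Random(seed)
    remaining = list(pool)
    if n <= 0 or not remaining:
        return []
    rng.shuffle(remaining)
    chosen = [remaining.pop()]  # the first test is picked at random
    while len(chosen) < n and remaining:
        candidates = rng.sample(range(len(remaining)), min(k, len(remaining)))
        best = max(
            candidates,
            key=lambda idx: min(distance(remaining[idx], t, selected) for t in chosen),
        )
        chosen.append(remaining.pop(best))
    return chosen


if __name__ == "__main__":
    # Hypothetical categories: (payment type, time of day, call type).
    pool = [
        ("card", "day", "local"),
        ("card", "night", "local"),
        ("cash", "day", "roaming"),
        ("cash", "night", "local"),
    ]
    # Use only categories 0 and 2 when measuring distance.
    print(select_tests(pool, selected=[0, 2], n=2))
```

Restricting `selected` to a few categories keeps each distance computation cheap when tests have many categories, which is the motivation the abstract gives for category selection; how the categories are chosen is not covered by this sketch.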
