Employing rule mining and multi-objective search for dynamic test case prioritization

Abstract: Test case prioritization (TP) is widely used in regression testing to reorder test cases so that specific criteria (e.g., higher fault detection capability) are met as early as possible. In our earlier work, we proposed an approach for black-box dynamic TP using rule mining and multi-objective search (named REMAP), which defines two objectives (fault detection capability and test case reliance score) and takes test case execution results into account at runtime. In this paper, we conduct an extensive empirical evaluation of REMAP by employing three different rule mining algorithms and three different multi-objective search algorithms. We also evaluate REMAP with one additional objective (estimated execution time), for a total of 18 different configurations of REMAP (i.e., 3 rule mining algorithms × 3 search algorithms × 2 different sets of objectives). Specifically, we empirically compare the 18 variants of REMAP against 1) two variants of random search, using two objectives and three objectives, 2) three variants of a greedy algorithm based on one objective, two objectives, and three objectives, 3) 18 variants of static search-based prioritization approaches, and 4) six variants of rule-based prioritization approaches, using two industrial and three open source case studies. Results showed that the two best variants of REMAP with two objectives and three objectives significantly outperformed the best variants of competing approaches by 84.4% and 88.9%, and managed to achieve on average 14.2% and 18.8% higher Average Percentage of Faults Detected per Cost (APFDc) scores.
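The abstract reports results in terms of APFDc without defining it. As a minimal sketch, assuming the commonly used cost-cognizant formulation (each fault's severity weighted by the execution cost remaining from the first test that detects it, minus half that test's cost, normalized by total cost times total severity), the score of a given test order could be computed as follows. The function name `apfdc` and its parameters are hypothetical and are not taken from REMAP.

```python
from typing import Dict, List, Optional, Set


def apfdc(order: List[str],
          cost: Dict[str, float],
          detects: Dict[str, Set[str]],
          severity: Optional[Dict[str, float]] = None) -> float:
    """Cost-cognizant APFD of a test order (illustrative sketch).

    order    -- test case ids in execution order
    cost     -- execution cost (e.g., time) per test case
    detects  -- faults detected by each test case
    severity -- weight per fault; defaults to 1.0 for every fault
    """
    faults = set().union(*detects.values())
    if severity is None:
        severity = {f: 1.0 for f in faults}

    total_cost = sum(cost[t] for t in order)
    total_severity = sum(severity[f] for f in faults)
    if total_cost == 0 or total_severity == 0:
        return 0.0

    # remaining[i] = summed cost of the tests at positions i..end
    remaining = [0.0] * (len(order) + 1)
    for i in range(len(order) - 1, -1, -1):
        remaining[i] = remaining[i + 1] + cost[order[i]]

    score = 0.0
    seen: Set[str] = set()
    for i, t in enumerate(order):
        found = detects.get(t, set())
        # Only the first test that reveals a fault contributes for that fault.
        for f in found - seen:
            score += severity[f] * (remaining[i] - 0.5 * cost[t])
        seen |= found
    # Faults never detected by the order contribute nothing to the numerator.
    return score / (total_cost * total_severity)


if __name__ == "__main__":
    cost = {"a": 1.0, "b": 3.0}
    detects = {"a": {"f1"}, "b": {"f2"}}
    print(apfdc(["a", "b"], cost, detects))  # cheap test first
    print(apfdc(["b", "a"], cost, detects))  # expensive test first
```

In this toy input, running the cheaper fault-revealing test first yields the higher score, which is the behavior a prioritization approach is trying to exploit.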
