Learning to Prioritize Test Programs for Compiler Testing

Compiler testing is crucial for guaranteeing the reliability of compilers (and of software systems in general). Many techniques have been proposed to facilitate automated compiler testing. These techniques rely on a large number of test programs (the test inputs of compilers) produced by test-generation tools (e.g., Csmith). However, they suffer from serious efficiency problems, because they usually take a long time to find compiler bugs. To accelerate compiler testing, it is desirable to prioritize the generated test programs so that those more likely to trigger compiler bugs are executed earlier. In this paper, we propose the idea of learning to test, which learns the characteristics of bug-revealing test programs from previous test programs that triggered bugs. Based on this idea, we propose LET, an approach to prioritizing test programs for compiler testing acceleration. LET consists of a learning process and a scheduling process. In the learning process, LET identifies a set of features of test programs and trains two models: a capability model, which predicts the probability that a new test program triggers a compiler bug, and a time model, which predicts the execution time of a test program. In the scheduling process, LET prioritizes new test programs according to their bug-revealing probabilities in unit time, which are calculated from the two trained models. Our extensive experiments show that LET significantly accelerates compiler testing. In particular, LET reduces testing time by more than 50% in 24.64% of the cases, and by between 25% and 50% in 36.23% of the cases.
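
The scheduling idea can be illustrated with a minimal Python sketch. It is not LET's implementation: the feature names (`num_pointers`, `num_loops`), the two toy predictor functions, and the `prioritize` helper are hypothetical stand-ins for the trained capability and time models, and the score is simply assumed to be the predicted bug-revealing probability divided by the predicted execution time.

```python
from typing import Callable, Dict, List

# A test program is represented here by a dictionary of numeric features.
# The feature names used below are hypothetical, chosen only for illustration.
Features = Dict[str, float]


def prioritize(
    programs: Dict[str, Features],
    predict_bug_prob: Callable[[Features], float],
    predict_time: Callable[[Features], float],
    min_time: float = 1e-6,
) -> List[str]:
    """Order program names by predicted bug-revealing probability per unit time.

    predict_bug_prob stands in for a trained capability model and
    predict_time for a trained time model (both are assumptions here).
    """

    def score(features: Features) -> float:
        # Guard against zero or negative predicted times.
        return predict_bug_prob(features) / max(predict_time(features), min_time)

    # Highest score first, so the most promising programs run earliest.
    return sorted(programs, key=lambda name: score(programs[name]), reverse=True)


# Toy stand-ins for the two models (assumed, not learned from data).
def toy_bug_prob(features: Features) -> float:
    return min(1.0, 0.01 * features["num_pointers"])


def toy_time(features: Features) -> float:
    return 1.0 + 0.5 * features["num_loops"]


programs = {
    "a": {"num_pointers": 10, "num_loops": 0},
    "b": {"num_pointers": 40, "num_loops": 8},
    "c": {"num_pointers": 5, "num_loops": 0},
}

order = prioritize(programs, toy_bug_prob, toy_time)
print(order)
```

In this toy setup, program `b` has the highest predicted probability but also the longest predicted execution time, so it is scheduled after `a`. That trade-off is what ranking by probability in unit time, rather than by probability alone, is meant to capture.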
