An empirical study on constraint optimization techniques for test generation

Constraint solving is a frequent, but expensive operation with symbolic execution to generate tests for a program. To improve the efficiency of test generation using constraint solving, four optimization techniques are usually applied to existing constraint solvers, which are constraint independence, constraint set simplification, constraint caching, and expression rewriting. In this paper, we conducted an empirical study, using these four constraint optimization techniques in a well known test generation tool KLEE with 77 GNU Coreutils applications, to systematically investigate how these optimization techniques affect the efficiency of test generation. The experimental results show that these constraint optimization techniques as well as their combinations cannot improve the efficiency of test generation significantly for ALL-SIZED programs. Moreover, we studied the constraint optimization techniques with respect to two static metrics, lines of code (LOC) and cyclomatic complexity (CC), of programs. The experimental results show that the “constraint set simplification” technique can improve the efficiency of test generation significantly for the programs with high LOC and CC values. The “constraint caching” optimization technique can improve the efficiency of test generation significantly for the programs with low LOC and CC values. Finally, we propose four hybrid optimization strategies and practical guidelines based on different static metrics.创新点我们使用KLEE执行了77个GNU Coreutils程序,用于系统的研究4种流行的约束优化技术如何影响测试用例生成的效率。结果表明,对于所有程序,这些约束优化技术及它们的组合不能大幅提高测试用例生成的效率。 此外,我们研究了程序的两个静态指标(代码行和圈复杂度)跟约束优化技术的关系。结果表明,对于大规模及高圈复杂度程序,使用“约束集简化”技术,对于小规模及低圈复杂度程序,使用“约束缓存”技术,都可以显著提高测试生成效率。最后,基于不同的静态指标,我们提出了四种混合优化策略和指导方针。

[1]  Zhenkai Liang,et al.  Automatically Identifying Trigger-based Behavior in Malware , 2008, Botnet Detection.

[2]  M. Shepperd,et al.  A critique of cyclomatic complexity as a software metric , 1988, Softw. Eng. J..

[3]  A. K. Chandra,et al.  Constraint solving for test case generation: a technique for high-level design verification , 1992, Proceedings 1992 IEEE International Conference on Computer Design: VLSI in Computers & Processors.

[4]  Dawson R. Engler,et al.  Execution Generated Test Cases: How to Make Systems Code Crash Itself , 2005, SPIN.

[5]  Dawson R. Engler,et al.  KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs , 2008, OSDI.

[6]  Armin Biere,et al.  Boolector: An Efficient SMT Solver for Bit-Vectors and Arrays , 2009, TACAS.

[7]  Bogdan Korel,et al.  The chaining approach for software test data generation , 1996, TSEM.

[8]  Arnaud Gotlieb,et al.  Automatic test data generation using constraint solving techniques , 1998, ISSTA '98.

[9]  Gregg Rothermel,et al.  Software testing: a research travelogue (2000–2014) , 2014, FOSE.

[10]  Baowen Xu,et al.  EFSM-Based Test Case Generation: Sequence, Data, and Oracle , 2015, Int. J. Softw. Eng. Knowl. Eng..

[11]  Hao Wang,et al.  Theory and Techniques for Automatic Generation of Vulnerability-Based Signatures , 2008, IEEE Transactions on Dependable and Secure Computing.

[12]  Cristian Cadar,et al.  Multi-solver Support in Symbolic Execution , 2013, SMT.

[13]  Patrice Godefroid,et al.  Automated Whitebox Fuzz Testing , 2008, NDSS.

[14]  Alessandro Orso,et al.  Optimizing Constraint Solving to Better Support Symbolic Execution , 2011, 2011 IEEE Fourth International Conference on Software Testing, Verification and Validation Workshops.

[15]  Craig A. Tovey,et al.  A simplified NP-complete satisfiability problem , 1984, Discret. Appl. Math..

[16]  Zhenkai Liang,et al.  Fast and automated generation of attack signatures: a basis for building self-protecting servers , 2005, CCS '05.

[17]  Koushik Sen,et al.  CUTE: a concolic unit testing engine for C , 2005, ESEC/FSE-13.

[18]  Ting Chen,et al.  State of the art: Dynamic symbolic execution for automated test generation , 2013, Future Gener. Comput. Syst..

[19]  Boris Beizer,et al.  Software Testing Techniques , 1983 .

[20]  Christopher Krügel,et al.  Exploring Multiple Execution Paths for Malware Analysis , 2007, 2007 IEEE Symposium on Security and Privacy (SP '07).

[21]  Zhenkai Liang,et al.  BitBlaze: A New Approach to Computer Security via Binary Analysis , 2008, ICISS.

[22]  David Brumley,et al.  Vulnerability-Specific Execution Filtering for Exploit Prevention on Commodity Software , 2006, NDSS.

[23]  Aaron Stump,et al.  Design and Results of the First Satisfiability Modulo Theories Competition (SMT-COMP 2005) , 2005, Journal of Automated Reasoning.

[24]  A. Jefferson Offutt,et al.  A semantic model of program faults , 1996, ISSTA '96.

[25]  Myra B. Cohen,et al.  An orchestrated survey of methodologies for automated software test case generation , 2013, J. Syst. Softw..

[26]  Klaus Schittkowski,et al.  NLPQL: A fortran subroutine solving constrained nonlinear programming problems , 1986 .

[27]  Zhenkai Liang,et al.  BitScope: Automatically Dissecting Malicious Binaries , 2007 .

[28]  Boris Beizer,et al.  Software testing techniques (2. ed.) , 1990 .

[29]  Dawson R. Engler,et al.  EXE: automatically generating inputs of death , 2006, CCS '06.

[30]  James C. King,et al.  Symbolic execution and program testing , 1976, CACM.

[31]  Paul Pettersson,et al.  Tools and Algorithms for the Construction and Analysis of Systems: 28th International Conference, TACAS 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2–7, 2022, Proceedings, Part II , 1998, TACAS.

[32]  Baowen Xu,et al.  Comparing logic coverage criteria on test case prioritization , 2012, Science China Information Sciences.

[33]  A. Jefferson Offutt,et al.  Constraint-Based Automatic Test Data Generation , 1991, IEEE Trans. Software Eng..

[34]  C. Jones,et al.  Software metrics: good, bad and missing , 1994, Computer.

[35]  Koushik Sen DART: Directed Automated Random Testing , 2009, Haifa Verification Conference.

[36]  David L. Dill,et al.  A Decision Procedure for Bit-Vectors and Arrays , 2007, CAV.

[37]  Nikolaj Bjørner,et al.  Z3: An Efficient SMT Solver , 2008, TACAS.

[38]  Hao Wang,et al.  Towards automatic generation of vulnerability-based signatures , 2006, 2006 IEEE Symposium on Security and Privacy (S&P'06).

[39]  Hao Wang,et al.  Creating Vulnerability Signatures Using Weakest Preconditions , 2007, 20th IEEE Computer Security Foundations Symposium (CSF'07).