Perses: Syntax-Guided Program Reduction

Given a program P that exhibits a certain property ψ (e.g., a C program that crashes GCC when it is being compiled), the goal of program reduction is to minimize P to a smaller variant P' that still exhibits the same property, i.e., ψ(P'). Program reduction is important and widely demanded for testing and debugging. For example, all compiler/interpreter development projects need effective program reduction to minimize failure-inducing test programs to ease debugging. However, state-of-the-art program reduction techniques, notably Delta Debugging (DD), Hierarchical Delta Debugging (HDD), and C-Reduce, either do not perform well in terms of speed (reduction time) and quality (size of reduced programs), or are highly customized for certain languages and thus lack generality. This paper presents Perses, a novel framework for effective, efficient, and general program reduction. The key insight is to exploit, in a general manner, the formal syntax of the programs under reduction and to ensure that each reduction step considers only smaller, syntactically valid variants, avoiding futile effort on syntactically invalid ones. Our framework supports not only deletion (as in DD and HDD), but also general, effective program transformations. We have designed and implemented Perses, and evaluated it for two language settings: C and Java. Our evaluation results on 20 C programs triggering bugs in GCC and Clang demonstrate Perses's strong practicality compared to the state of the art: (1) smaller size: Perses's results are respectively 2% and 45% of the size of those from DD and HDD; and (2) shorter reduction time: Perses takes 23% and 47% of the time taken by DD and HDD, respectively. Even when compared to the highly customized and optimized C-Reduce for C/C++, Perses takes only 38-60% of the reduction time.
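
The key insight can be illustrated with a minimal sketch. This is not the Perses algorithm itself, only a simplified, deletion-only illustration under stated assumptions: the toy tree type `Node`, the helpers `render` and `reduce_tree`, and the example property `psi` are all hypothetical names introduced here. The sketch assumes the grammar tells us which nodes are list-like (children of a repetition rule), so removing any one child always yields a syntactically valid variant, and ψ is never evaluated on an invalid program.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Toy parse-tree node (hypothetical, for illustration only).

    kind is 'list'  (children of a repetition rule; each is removable),
                'seq'   (all children are required by the grammar), or
                'token' (a leaf carrying source text).
    """
    kind: str
    children: List["Node"] = field(default_factory=list)
    text: str = ""


def tok(text: str) -> Node:
    return Node("token", text=text)


def render(node: Node) -> str:
    """Turn the tree back into source text."""
    if node.kind == "token":
        return node.text
    parts = (render(child) for child in node.children)
    return " ".join(p for p in parts if p)


def reduce_tree(root: Node, psi: Callable[[str], bool]) -> Node:
    """Delete children of list nodes while psi still holds, to a fixpoint."""
    changed = True
    while changed:
        changed = False
        stack = [root]
        while stack:
            node = stack.pop()
            if node.kind == "list":
                i = 0
                while i < len(node.children):
                    removed = node.children.pop(i)
                    if psi(render(root)):
                        changed = True  # smaller variant keeps the property
                    else:
                        node.children.insert(i, removed)  # undo the deletion
                        i += 1
            stack.extend(node.children)
    return root


# Example: a block statement nested inside a statement list.
program = Node("list", [
    tok("int a = 1;"),
    Node("seq", [tok("{"),
                 Node("list", [tok("int b = 2;"), tok("crash();")]),
                 tok("}")]),
    tok("int c = 3;"),
])


def psi(source: str) -> bool:
    # Stand-in for "the compiler still crashes on this program".
    return "crash();" in source


reduced = render(reduce_tree(program, psi))
print(reduced)  # { crash(); }
```

Because the braces belong to a `seq` node, the sketch never tries to delete a lone `{` or `}`, which is the kind of futile, syntactically invalid variant that a purely textual reducer would spend time testing.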
