Feedback-directed differential testing of interactive debuggers

To understand, localize, and fix programming errors, developers often rely on interactive debuggers. However, since debuggers are themselves software, they may contain bugs, which can make debugging unnecessarily hard or even lead developers to reason about errors that do not actually exist in their code. This paper presents the first automated testing technique for interactive debuggers. Testing debuggers differs fundamentally from the well-studied problem of testing compilers, because debuggers are interactive and because they lack a specification of the expected behavior. Our approach, called DBDB, generates debugger actions to exercise the debugger and records traces that summarize the debugger's behavior. By comparing the traces of multiple debuggers with each other, we find diverging behavior that points to bugs and other noteworthy differences. We evaluate DBDB on the JavaScript debuggers of Firefox and Chromium, finding 19 previously unreported bugs, eight of which have already been fixed by the developers.
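To illustrate the core idea, the following Python toy is a minimal sketch of feedback-directed differential testing, not DBDB's actual implementation. It drives two simulated debuggers with the same randomly chosen actions, records a trace of their observable states after each action, and reports the first divergence. All names here (`SimulatedDebugger`, `differential_test`, the action set, the trace format) are illustrative assumptions; a real harness would adapt the browsers' remote debugging protocols instead.

```python
import random
from typing import List, Optional

class SimulatedDebugger:
    """Toy stand-in for a real debugger front end (e.g., Firefox's or
    Chromium's JavaScript debugger). It only tracks the current line of an
    imaginary program; a real harness would drive the browsers' remote
    debugging protocols instead."""

    def __init__(self, step_size: int = 1):
        self.line = 1
        self.step_size = step_size  # step_size != 1 simulates a stepping bug

    def step_over(self) -> None:
        self.line += self.step_size

    def resume(self) -> None:
        self.line = 1  # run to the next breakpoint (fixed at line 1 here)

    def state(self) -> str:
        return f"paused at line {self.line}"

def differential_test(dbg_a: SimulatedDebugger,
                      dbg_b: SimulatedDebugger,
                      max_actions: int = 50,
                      seed: int = 0) -> Optional[List[str]]:
    """Apply the same randomly chosen actions to both debuggers, record a
    trace of their observable states, and stop at the first divergence."""
    rng = random.Random(seed)
    trace: List[str] = []
    for _ in range(max_actions):
        action = rng.choice(["step_over", "resume"])  # feedback could bias this
        getattr(dbg_a, action)()
        getattr(dbg_b, action)()
        state_a, state_b = dbg_a.state(), dbg_b.state()
        trace.append(f"{action}: A={state_a} | B={state_b}")
        if state_a != state_b:
            return trace  # diverging behavior: a candidate bug to inspect
    return None  # no divergence observed within the action budget

# One debugger steps one line at a time; the other has an injected bug.
report = differential_test(SimulatedDebugger(1), SimulatedDebugger(2))
if report is not None:
    print("Divergence found:")
    print("\n".join(report))
```

Because there is no specification of correct debugger behavior, the divergence itself is the test oracle: a trace mismatch only flags a candidate bug in one of the two debuggers, which still has to be triaged by hand.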
