Improving oracle quality by detecting brittle assertions and unused inputs in tests

Writing oracles is challenging. As a result, developers often create oracles that check too little, resulting in tests that are unable to detect failures, or that check too much, resulting in tests that are brittle and difficult to maintain. In this paper we present a new technique for automatically analyzing test oracles. The technique is based on dynamic tainting and detects both brittle assertions—assertions that depend on values derived from uncontrolled inputs—and unused inputs—inputs provided by the test that are not checked by any assertion. We also present OraclePolish, an implementation of the technique that can analyze tests that are written in Java and use the JUnit testing framework. Using OraclePolish, we conducted an empirical evaluation of more than 4000 real test cases. The results of the evaluation show that OraclePolish is effective: it detected 164 tests that contain brittle assertions and 1618 tests that have unused inputs. In addition, the results demonstrate that the costs associated with using the technique are reasonable.
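To make the two oracle problems concrete, the sketch below shows what a brittle assertion and an unused input might look like in practice. The method `formatReceipt` and all names here are hypothetical, invented purely for illustration; the brittle case is commented out because it fails nondeterministically by construction.

```java
import java.util.Date;

// Hypothetical examples of the two oracle defects the paper targets.
public class OracleExamples {

    // Hypothetical method under test: its output embeds a timestamp,
    // an input the test cannot control.
    static String formatReceipt(String item, double price) {
        return item + ": $" + price + " @ " + new Date();
    }

    public static void main(String[] args) {
        // Brittle assertion: the expected string depends on new Date(),
        // a value derived from an uncontrolled input, so this check can
        // fail even when formatReceipt is correct.
        String receipt = formatReceipt("book", 9.99);
        // assert receipt.equals("book: $9.99 @ Tue Jan 01 00:00:00 2025");

        // Unused input: the test supplies price = 1.50, but the assertion
        // never checks it, so a fault in price handling goes undetected.
        String receipt2 = formatReceipt("pen", 1.50);
        boolean ok = receipt2.startsWith("pen");  // only the item is checked
        System.out.println(ok ? "PASS" : "FAIL");
    }
}
```

Dynamic tainting detects both cases by tracking which test-controlled values actually flow into each assertion: the brittle assertion reads tainted data from outside the test's control, while the unused `price` input carries a taint mark that never reaches any assertion.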
