Defect Detection Efficiency: Test Case Based vs. Exploratory Testing

This paper presents a controlled experiment comparing the defect detection efficiency of exploratory testing (ET) and test case based testing (TCT). While the traditional testing literature emphasizes test cases, ET stresses the individual tester's skills during test execution and does not rely on predesigned test cases. In the experiment, 79 advanced software engineering students performed manual functional testing on an open-source application containing both actual and seeded defects. Each student took part in two 90-minute controlled sessions, using ET in one and TCT in the other. We found no significant difference in defect detection efficiency between TCT and ET. The distributions of detected defects did not differ significantly with respect to technical type, detection difficulty, or severity. However, TCT produced significantly more false defect reports than ET. Surprisingly, our results show no benefit from using predesigned test cases in terms of defect detection efficiency, underlining the need for further studies of manual testing.
