Comparing the effectiveness of testing methods in improving programs: the effect of variations in program quality

We compare the efficacy of different testing methods for improving the reliability of software. Specifically, we use modelling to compare "operational" testing, in which test cases are chosen according to their probability of occurring in actual use of the software, against "debug" testing methods, in which the testers look for test cases that they consider likely to cause failure or that satisfy some coverage criterion. We base our comparisons on the reliability reached by the program at the end of testing. Unlike previous studies, we consider the probability distribution of the achieved reliability, and thus the probability of satisfying specific requirements, rather than just the average reliability achieved. We take account of two sources of variation: the variation between the actual test histories that are possible for a given program and a given test method, and the fact that different programs start testing with different faults and initial reliability levels. By necessity, we use very simplified models of reality. Even so, we can draw some interesting conclusions with important practical consequences. In general, there are stronger arguments in favor of operational testing than previous studies have shown.
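The difference between comparing average reliability and comparing the probability of meeting a requirement can be illustrated with a small Monte Carlo sketch. This is not the model used in the paper; it is a toy under loudly stated assumptions: each potential fault is present with some probability, has a failure rate `q` under the operational profile, is detected independently by each test with a method-specific probability, and is removed perfectly once detected. The function names and all numeric values are hypothetical.

```python
import random

def simulate_final_failure_rate(faults, detect_probs, num_tests, rng):
    """Simulate one program and one test history.

    faults: list of (present_prob, q) pairs; q is the fault's failure
            probability per demand in operation (assumed disjoint regions).
    detect_probs: per-test probability that the method detects each fault.
    Returns the failure probability per demand left after testing.
    """
    remaining = 0.0
    for (present_prob, q), p in zip(faults, detect_probs):
        if rng.random() >= present_prob:
            continue  # this fault was never in this particular program
        # Assumption: tests are independent, so the fault survives all
        # num_tests tests with probability (1 - p) ** num_tests.
        if rng.random() < (1.0 - p) ** num_tests:
            remaining += q
    return remaining

def compare(faults, debug_probs, num_tests, requirement, runs=20000, seed=1):
    """Return {method: (mean failure rate, P(failure rate <= requirement))}."""
    rng = random.Random(seed)
    # Operational testing samples inputs as in real use, so each fault is
    # detected per test with probability equal to its failure rate q.
    operational_probs = [q for _, q in faults]
    results = {}
    for name, probs in (("operational", operational_probs),
                        ("debug", debug_probs)):
        outcomes = [simulate_final_failure_rate(faults, probs, num_tests, rng)
                    for _ in range(runs)]
        mean = sum(outcomes) / runs
        prob_ok = sum(o <= requirement for o in outcomes) / runs
        results[name] = (mean, prob_ok)
    return results

if __name__ == "__main__":
    # Hypothetical program population: one large fault, one small fault.
    faults = [(0.5, 0.01), (0.5, 0.0001)]
    # Hypothetical debug method: weak on the large fault, strong on the small one.
    debug_probs = [0.002, 0.02]
    for method, (mean, prob_ok) in compare(faults, debug_probs,
                                           num_tests=300,
                                           requirement=0.001).items():
        print(method, "mean failure rate:", mean, "P(requirement met):", prob_ok)
```

Reporting both the mean and the probability of meeting the requirement for each method makes it possible to see cases in which two methods rank differently depending on which of the two criteria is used.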
