Pairwise Testing: A Best Practice That Isn't

Pairwise testing is a wildly popular approach to combinatorial testing problems. The number of articles and textbooks covering the topic continues to grow, as does the number of commercial and academic courses that teach the technique. Despite the technique's popularity and its reputation as a best practice, we find it to be overpromoted and poorly understood. In this paper, we define pairwise testing and review many of the studies conducted using it. Based on these studies and our experience with pairwise testing, we discuss weaknesses we perceive in the technique. Knowledge of the weaknesses of pairwise testing, or of any testing technique, is essential if we are to apply the technique wisely. We conclude by restating the story of pairwise testing and by warning testers against blindly accepting best practices.

No Simple Formulas for Good Testing

A commonly cited rule of thumb among test managers is that testing accounts for half the budget of a typical complex commercial software project. But testing isn't just expensive; it's arbitrarily expensive. That's because there are more distinct imaginable test cases for even a simple software product than can be performed in the natural lifetime of any tester [1]. Pragmatic software testing, therefore, requires that we take shortcuts that keep costs down. Each shortcut has its pitfalls. We absolutely need shortcuts, but we also need to choose them and manage them wisely. Contrary to the fondest wishes of management, there are no pat formulas that dictate the best course of testing. In testing software, there are no "best practices" that we simply must "follow" in order to achieve success.

Take domain partitioning as an example. In this technique, instead of testing with every possible value of every possible variable, the tester divides test data or test conditions into different sets, wherein each member of a set is more or less equivalent to any other member of the same set for the purposes of discovering defects. These sets are called equivalence classes. For example, instead of testing with every single model of printer, the tester might treat all Hewlett-Packard inkjet printers as roughly equivalent. He might therefore test with only one of them as a representative of that entire set (see the sketch at the end of this section). This method can save us a vast amount of time and energy without risking too much of our confidence in our test coverage, as long as we can tell what is equivalent to what. This turns out to be quite difficult in many cases. And if we get it wrong, our test coverage won't be as good as we think it is [2].

Despite the problem that it's not obvious which tests or test data are actually equivalent among the many possibilities, domain partitioning is touted as a technique we should be using [3]. But the instruction "do domain testing" is almost useless unless the tester is well versed in the technology to be tested and proficient in the analysis required for domain partitioning. It trivializes the testing craft to promote a practice without promoting the skill and knowledge needed to do it well.

This article is about another apparent best practice that turns out to be less than it seems: pairwise testing. Pairwise testing can be helpful, or it can create false confidence. On the whole, we believe that this technique is overpromoted and poorly understood. To apply it wisely, we think it's important to put pairwise testing into a sensible perspective.
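As a concrete illustration of the domain partitioning idea above, here is a minimal sketch in Python. It is our own hypothetical example, not a published procedure: the class names and printer models are made up, and the point is only the mechanics of grouping roughly equivalent conditions and testing one representative per class.

# Domain partitioning sketch: group roughly equivalent test conditions
# into equivalence classes, then test one representative of each class.
# The class names and printer models below are hypothetical.

equivalence_classes = {
    "hp_inkjet":    ["HP DeskJet 840", "HP DeskJet 895", "HP OfficeJet G55"],
    "hp_laser":     ["HP LaserJet 4050", "HP LaserJet 8100"],
    "epson_inkjet": ["Epson Stylus 660", "Epson Stylus 880"],
}

# One representative per class: 3 printers to test instead of 7.
representatives = {cls: models[0] for cls, models in equivalence_classes.items()}

for cls, model in representatives.items():
    print(f"test class {cls} with: {model}")

The saving is real, but so is the risk noted above: if the members of a class are not actually equivalent for the purpose of finding defects, the untested members represent untested risk.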
Combinations are Hard to Test

Combinatorial testing is a problem that faces us whenever we have a product that processes multiple variables that may interact. The variables may come from a variety of sources, such as the user interface, the operating system, peripherals, a database, or from across a network. The task in combinatorial testing goes beyond testing individual variables (although that must be done as well). In combinatorial testing, the task is to verify that different combinations of variables are handled correctly by the system.

An example of a combinatorial testing problem is the options dialog of Microsoft Word. Consider just one sub-section of one panel (fig. 1). Maybe when the status bar is turned off, and the vertical scroll bar is also turned off, the product will crash. Or maybe it will crash only when the status bar is on and the vertical scroll bar is off. If you consider any of those conditions to represent interesting risks, you will want to test them.

Figure 1. Portion of Options Dialog in MS Word

Combinatorial testing is difficult because of the large number of possible test cases, a result of the "combinatorial explosion" of selected test data values for the system's input variables. For example, in the Word options dialog box there are 12,288 combinations: 2^12 for the twelve binary settings, times 3 for the Field shading menu, which contains three items. Running all possible combinatorial test cases is generally not possible due to the large amount of time and resources required.

Pairwise Testing is a Combinatorial Testing Shortcut

Pairwise testing is an economical alternative to testing all possible combinations of a set of variables. In pairwise testing, a set of test cases is generated that covers all combinations of the selected test data values for each pair of variables. Pairwise testing is also referred to as all-pairs testing and 2-way testing. It is also possible to do all triples (3-way) or all quadruples (4-way) testing, of course, but the size of these higher-order test sets grows very rapidly.

Pairwise testing normally begins by selecting values for the system's input variables. These individual values are often selected using domain partitioning. The values are then permuted to achieve coverage of all the pairings. This is very tedious to do by hand. Practical techniques used to create pairwise test sets include orthogonal arrays, a technique borrowed from the design of statistical experiments [4-6], and algorithmic approaches such as the greedy algorithm presented by Cohen, et al. [7]. A free, open source tool to produce pairwise test cases, written by one of the authors (Bach), is available from Satisfice, Inc. [http://www.satisfice.com/tools/pairs.zip].

For a simple example of pairwise testing, consider the system in Figure 2. System S has three input variables X, Y, and Z. Assume that a set D of test data values has been selected for each of the input variables such that D(X) = {1, 2}, D(Y) = {Q, R}, and D(Z) = {5, 6}. The total number of possible test cases is 2 x 2 x 2 = 8. A pairwise test set has a size of only 4 test cases and is shown in Table 1. (A short sketch verifying the pairwise property of this set follows the table.)

Table 1: Pairwise test cases for System S

Test ID   Input X   Input Y   Input Z
1         1         Q         5
2         1         R         6
3         2         Q         6
4         2         R         5
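To make the pairwise property of Table 1 concrete, here is a minimal sketch in Python. It is our own illustration, not the Satisfice tool or the algorithm of Cohen, et al., and the function name covers_all_pairs is our invention. It checks that every combination of values for every pair of variables appears in at least one of the four test cases:

# Sketch: verify that a test set is pairwise (2-way) adequate, i.e. that
# every value pair of every variable pair appears in some test case.

from itertools import combinations, product

# Selected test data values for System S (from the example above).
domains = {"X": [1, 2], "Y": ["Q", "R"], "Z": [5, 6]}

# The four test cases of Table 1, as {variable: value} dicts.
tests = [
    {"X": 1, "Y": "Q", "Z": 5},
    {"X": 1, "Y": "R", "Z": 6},
    {"X": 2, "Y": "Q", "Z": 6},
    {"X": 2, "Y": "R", "Z": 5},
]

def covers_all_pairs(tests, domains):
    # For every pair of variables, every combination of their selected
    # values must be covered by at least one test case.
    for var_a, var_b in combinations(domains, 2):
        needed = set(product(domains[var_a], domains[var_b]))
        covered = {(t[var_a], t[var_b]) for t in tests}
        if not needed <= covered:
            return False
    return True

print(covers_all_pairs(tests, domains))  # True: 4 of the 8 possible cases suffice

The four cases cover all 12 variable-value pairs (4 value pairs for each of the 3 variable pairs), which is why half of the exhaustive set can be dropped at the pair level. A greedy generator builds such a set by repeatedly choosing a candidate test that covers the most not-yet-covered pairs.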

References

[1] Robert Mandl, et al., "Orthogonal Latin squares: an application of experiment design to compiler testing," CACM, 1985.
[2] Richard G. Hamlet, et al., "Partition Testing Does Not Inspire Confidence," IEEE Trans. Software Eng., 1990.
[3] Robert Brownlie, et al., "Robust testing of AT&T PMX/StarMAIL using OATS," AT&T Technical Journal, 1992.
[4] Siddhartha R. Dalal, et al., "The Automatic Efficient Test Generator (AETG) system," Proceedings of the 1994 IEEE International Symposium on Software Reliability Engineering, 1994.
[5] R. L. Erickson, et al., "Improved quality of protocol testing through techniques of experimental design," Proceedings of ICC/SUPERCOMM '94, International Conference on Communications, 1994.
[6] Joseph Robert Horgan, et al., "Effect of test set size and block coverage on the fault detection effectiveness," Proceedings of the 1994 IEEE International Symposium on Software Reliability Engineering, 1994.
[7] Lee J. White, "Regression testing of GUI event interactions," Proceedings of the 1996 International Conference on Software Maintenance, 1996.
[8] D. M. Cohen, et al., "The Combinatorial Design Approach to Automatic Test Generation," IEEE Software, 1996.
[9] Robert L. Probert, et al., "A practical strategy for testing pair-wise coverage of network interfaces," Proceedings of ISSRE '96: 7th International Symposium on Software Reliability Engineering, 1996.
[10] Michael L. Fredman, et al., "The AETG System: An Approach to Testing Based on Combinatorial Design," IEEE Trans. Software Eng., 1997.
[11] Yashwant K. Malaiya, et al., "Automatic test generation using checkpoint encoding and antirandom testing," Proceedings of the Eighth International Symposium on Software Reliability Engineering, 1997.
[12] C. L. Mallows, et al., "Applying Design of Experiments to Software Testing," Proceedings of the 19th International Conference on Software Engineering, 1997.
[13] Yu Lei, et al., "In-parameter-order: a test generation strategy for pairwise testing," Proceedings of the Third IEEE International High-Assurance Systems Engineering Symposium, 1998.
[14] Ashish Jain, et al., "Model-based testing of a highly programmable system," Proceedings of the Ninth International Symposium on Software Reliability Engineering, 1998.
[15] Colin L. Mallows, et al., "Factor-covering designs for testing software," 1998.
[16] Nicola Muscettola, et al., "Challenges and Methods in Testing the Remote Agent Planner," AIPS, 2000.
[17] Jeremy M. Harrell, "Orthogonal Array Testing Strategy (OATS) Technique," 2001.
[18] Steven Splaine, et al., "The Web Testing Handbook," 2001.
[19] John D. McGregor, et al., "A Practical Guide to Testing Object-Oriented Software," Addison-Wesley Object Technology Series, 2001.
[20] D. Richard Kuhn, et al., "Failure Modes in Medical Device Software: An Analysis of 15 Years of Recall Data," 2001.
[21] Stefan P. Jaskiel, et al., "Systematic Software Testing," 2002.
[22] Bogdan Korel, et al., "Generating expected results for automated black-box testing," Proceedings of the 17th IEEE International Conference on Automated Software Engineering, 2002.
[23] Cem Kaner, et al., "Lessons Learned in Software Testing: A Context-Driven Approach," 2002.
[24] Lee Copeland, "A Practitioner's Guide to Software Test Design," 2003.
[25] Cem Kaner, "Teaching domain testing: a status report," Proceedings of the 17th Conference on Software Engineering Education and Training, 2004.
[26] Patrick J. Schroeder, et al., "Comparing the fault detection effectiveness of n-way and random test suites," Proceedings of the 2004 International Symposium on Empirical Software Engineering (ISESE '04), 2004.
[27] Alexander Pretschner, et al., "Model-Based Testing in Practice," FM, 2005.