Finding the Optimal Balance between Over and Under Approximation of Models Inferred from Execution Logs

Models inferred from execution traces (logs) may admit more behaviours than those possible in the real system (over-approximation) or may exclude behaviours that can indeed occur in the real system (under-approximation). Both problems negatively affect model based testing. In fact, over-approximation results in infeasible test cases, i.e., test cases that cannot be activated by any input data. Under-approximation results in missing test cases, i.e., system behaviours that are not represented in the model are also never tested. In this paper we balance over- and under-approximation of inferred models by resorting to multi-objective optimization achieved by means of two search-based algorithms: A multi-objective Genetic Algorithm (GA) and the NSGA-II. We report the results on two open-source web applications and compare the multi-objective optimization to the state-of-the-art KLFA tool. We show that it is possible to identify regions in the Pareto front that contain models which violate fewer application constraints and have a higher bug detection ratio. The Pareto fronts generated by the multi-objective GA contain a region where models violate on average 2% of an application's constraints, compared to 2.8% for NSGA-II and 28.3% for the KLFA models. Similarly, it is possible to identify a region on the Pareto front where the multi-objective GA inferred models have an average bug detection ratio of 110 : 3 and the NSGA-II inferred models have an average bug detection ratio of 101 : 6. This compares to a bug detection ratio of 310928 : 13 for the KLFA tool.

[1]  Yuriy Brun,et al.  Using dynamic execution traces and program invariants to enhance behavioral model inference , 2010, 2010 ACM/IEEE 32nd International Conference on Software Engineering.

[2]  Leonardo Mariani,et al.  Dynamic Detection of COTS Component Incompatibility , 2007, IEEE Software.

[3]  Sriram Raghavan,et al.  Regular Expression Learning for Information Extraction , 2008, EMNLP.

[4]  Steven P. Reiss,et al.  Encoding program executions , 2001, Proceedings of the 23rd International Conference on Software Engineering. ICSE 2001.

[5]  DebK.,et al.  A fast and elitist multiobjective genetic algorithm , 2002 .

[6]  Leonardo Mariani,et al.  Automated Identification of Failure Causes in System Logs , 2008, 2008 19th International Symposium on Software Reliability Engineering (ISSRE).

[7]  Jerome A. Feldman,et al.  On the Synthesis of Finite-State Machines from Samples of Their Behavior , 1972, IEEE Transactions on Computers.

[8]  Stefan C. Kremer,et al.  Inducing Grammars from Sparse Data Sets: A Survey of Algorithms and Results , 2003, J. Mach. Learn. Res..

[9]  John H. Holland,et al.  Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .

[10]  Neil Walkinshaw,et al.  A framework for the competitive evaluation of model inference techniques , 2010, MIIT '10.

[11]  Aurélien Lemay,et al.  Learning regular languages using RFSAs , 2004, Theor. Comput. Sci..

[12]  Paolo Tonella,et al.  State-Based Testing of Ajax Web Applications , 2008, 2008 1st International Conference on Software Testing, Verification, and Validation.

[13]  Andreas Zeller,et al.  Mining object behavior with ADABU , 2006, WODA '06.

[14]  Kalyanmoy Deb,et al.  A fast and elitist multiobjective genetic algorithm: NSGA-II , 2002, IEEE Trans. Evol. Comput..

[15]  Christoph C. Michael,et al.  Using Finite Automata to Mine Execution Data for Intrusion Detection: A Preliminary Report , 2000, Recent Advances in Intrusion Detection.

[16]  Alexander L. Wolf,et al.  Discovering models of software processes from event-based data , 1998, TSEM.

[17]  Marijn J. H. Heule,et al.  Exact DFA Identification Using SAT Solvers , 2010, ICGI.

[18]  Rajesh Parekh,et al.  Grammar Inference Automata Induction and Language Acquisition , 2005 .

[19]  Leonardo Mariani,et al.  Automatic generation of software behavioral models , 2008, 2008 ACM/IEEE 30th International Conference on Software Engineering.

[20]  J. Feldman,et al.  Learning Automata from Ordered Examples , 1991 .

[21]  James R. Larus,et al.  Mining specifications , 2002, POPL '02.

[22]  Paolo Tonella,et al.  Search-Based Testing of Ajax Web Applications , 2009, 2009 1st International Symposium on Search Based Software Engineering.