Making System User Interactive Tests Repeatable: When and What Should we Control?

System user interactive tests are widely used to evaluate the behavior of an application as a whole. To automate this process, many techniques have been proposed, and their effectiveness is evaluated with metrics such as code coverage and fault detection. However, most previous work assumes that the outputs of interactive tests are deterministic. In this paper, we propose three layers of test output to examine: the code layer (code coverage), the behavioral layer (invariant detection), and the user interaction layer (fault detection with a GUI oracle). We further study the impact on these metrics of a common set of factors: operating system, Java version, initial starting state, and time delay. A comprehensive experiment has been conducted on Java Swing applications, and the results show that as many as 184 lines can be covered differently and that there can be up to 96% false positives with respect to fault detection. We plan to study the repeatability of interactive tests on the Android platform.
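To make the code-layer metric concrete, the following minimal sketch shows one way "lines covered differently" between repeated runs of the same test could be counted. It is an illustration under stated assumptions, not the measurement procedure used in the study: the function names, the representation of a run as a set of (file, line) pairs, and the sample data are all hypothetical.

```python
from itertools import combinations


def lines_covered_differently(run_a, run_b):
    """Return the lines covered in exactly one of the two runs."""
    return set(run_a) ^ set(run_b)


def max_coverage_variance(runs):
    """Return the largest count of differently covered lines over all pairs of runs."""
    return max(
        (len(lines_covered_differently(a, b)) for a, b in combinations(runs, 2)),
        default=0,
    )


# Hypothetical coverage data: each run of the same test is a set of
# (file, line) pairs. These values are made up for illustration only.
runs = [
    {("Main.java", 10), ("Main.java", 11), ("Dialog.java", 42)},
    {("Main.java", 10), ("Main.java", 11)},
    {("Main.java", 10), ("Main.java", 12), ("Dialog.java", 42)},
]

print(max_coverage_variance(runs))
```

A result of zero for every pair would indicate that the test is repeatable at the code layer under the controlled factors; any nonzero value points to nondeterminism worth investigating.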

[4]  Myra B. Cohen,et al.  Making system user interactive tests repeatable: when and what should we control? , 2015, ICSE 2015.
