Navigating Information Overload Caused by Automated Testing - a Clustering Approach in Multi-Branch Development

Background. Test automation is a widely used technique to increase the efficiency of software testing. However, executing more test cases increases the effort required to analyze test results. At Qlik, automated tests run nightly for up to 20 development branches, each containing thousands of test cases, resulting in information overload.
Aim. We therefore develop a tool that supports the analysis of test results.
Method. We create NIOCAT, a tool that clusters similar test case failures to help the analyst identify underlying causes. To evaluate the tool, we conduct experiments on manually created subsets of failed test cases representing different use cases, and hold a focus group meeting with test analysts at Qlik.
Results. The case study shows that NIOCAT creates accurate clusters, in line with analyses performed by human analysts. Further, the potential time-savings of our approach are confirmed by the participants in the focus group.
Conclusions. NIOCAT provides a feasible complement to current automated testing practices at Qlik by reducing information overload.
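To make the clustering idea concrete, the following is a minimal illustrative sketch, not the actual NIOCAT implementation: failure messages are turned into term-frequency vectors and greedily grouped by cosine similarity against each cluster's first member. The `threshold` parameter and the sample messages are assumptions for illustration only.

```python
# Hedged sketch of failure clustering (NOT the real NIOCAT code):
# represent each failure message as a term-frequency vector and
# greedily assign it to the first sufficiently similar cluster.
from collections import Counter
import math

def vectorize(text):
    """Term-frequency vector over lowercased, whitespace-split tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_failures(messages, threshold=0.5):
    """Group failure messages; returns clusters as lists of indices."""
    clusters = []  # list of (representative_vector, member_indices)
    for i, msg in enumerate(messages):
        vec = vectorize(msg)
        for rep, members in clusters:
            if cosine(vec, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

# Hypothetical nightly failures from two branches:
failures = [
    "Timeout waiting for chart render on branch main",
    "Timeout waiting for chart render on branch release-1",
    "NullPointerException in export module",
]
print(cluster_failures(failures))  # -> [[0, 1], [2]]
```

The two timeout failures land in one cluster because they share most terms, so an analyst would inspect one underlying cause instead of two separate reports; the threshold trades cluster purity against the number of clusters to review.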
