Regression Identification of Coincidental Correctness via Weighted Clustering

Coverage-based fault localization (CBFL) techniques use coverage information to identify suspicious program entities for inspection. However, coincidental correctness (CC), in which a test executes the faulty code yet still passes, occurs widely during software debugging and reduces the effectiveness of CBFL techniques. In this paper, we propose a regression approach that identifies CC executions through weighted cluster analysis. Based on the observation that program entities with different suspiciousness contribute differently to identifying coincidental correctness, we use the suspiciousness calculated by CBFL techniques as the weight of each program entity and conduct weighted clustering to identify coincidental correctness regressively. To evaluate our approach, we conduct controlled experiments on benchmark programs. The experimental results show that our approach improves the accuracy of identifying CC executions and, in turn, the effectiveness of CBFL techniques.
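The idea of weighting each program entity by its suspiciousness can be illustrated with a small sketch. The abstract does not specify the suspiciousness formula, the clustering algorithm, or how the regression step is carried out, so everything below is an assumption: Ochiai is used as the CBFL metric, a tiny two-cluster k-means stands in for the weighted clustering, and "regressively" is read as repeating the process with flagged tests discarded until the flagged set stops changing. The function names (`ochiai`, `weighted_two_means`, `identify_cc`) and the toy coverage matrix are hypothetical.

```python
from math import sqrt

def ochiai(coverage, failed):
    """Ochiai suspiciousness per entity from a 0/1 coverage matrix (assumed metric)."""
    total_failed = sum(failed)
    scores = []
    for j in range(len(coverage[0])):
        ef = sum(1 for row, f in zip(coverage, failed) if row[j] and f)
        ep = sum(1 for row, f in zip(coverage, failed) if row[j] and not f)
        denom = sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores

def weighted_distance(x, y, weights):
    """Euclidean distance in which each entity counts in proportion to its weight."""
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def weighted_two_means(rows, weights, seeds, iterations=20):
    """Minimal 2-means under the weighted distance; returns a 0/1 label per row."""
    centers = [list(seeds[0]), list(seeds[1])]
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [
            0 if weighted_distance(r, centers[0], weights)
            <= weighted_distance(r, centers[1], weights) else 1
            for r in rows
        ]
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels

def identify_cc(coverage, failed, max_rounds=10):
    """Flag passing tests that cluster with the failing tests (hypothetical procedure)."""
    passing = [i for i, f in enumerate(failed) if not f]
    failing_rows = [row for row, f in zip(coverage, failed) if f]
    if not passing or not failing_rows:
        return []
    target = centroid(failing_rows)
    rows = [coverage[i] for i in passing]
    flagged = set()
    for _ in range(max_rounds):
        # Assumption: suspected CC tests are discarded before recomputing weights.
        kept = [i for i in range(len(failed)) if i not in flagged]
        weights = ochiai([coverage[i] for i in kept], [failed[i] for i in kept])
        # Seed one cluster at the failing centroid, the other at the farthest passing test.
        far = max(rows, key=lambda r: weighted_distance(r, target, weights))
        labels = weighted_two_means(rows, weights, (target, far))
        new_flagged = {i for i, lab in zip(passing, labels) if lab == 0}
        if new_flagged == flagged:
            break
        flagged = new_flagged
    return sorted(flagged)

# Toy data: rows are tests, columns are entities; entity 2 plays the faulty one.
coverage = [
    [1, 1, 1, 0],  # test 0, fails
    [1, 0, 1, 1],  # test 1, fails
    [1, 1, 1, 0],  # test 2, passes although it executes entity 2
    [1, 1, 0, 1],  # test 3, passes
    [1, 0, 0, 1],  # test 4, passes
    [1, 1, 0, 0],  # test 5, passes
]
failed = [True, True, False, False, False, False]
print(identify_cc(coverage, failed))  # → [2]
```

In this toy setting, discarding the flagged test raises the Ochiai score of entity 2 from about 0.82 to 1.0, which mirrors the abstract's claim that better CC identification feeds back into fault localization. Whether flagged tests should be discarded or relabeled as failing is a design choice the abstract leaves open.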
