Bug Signature Minimization and Fusion

Debugging is a time-consuming activity. To help with debugging, many approaches have been proposed to pinpoint the location of errors given labeled failing and correct executions. While such approaches have been shown to be accurate, the location alone is at times not sufficient to help programmers understand why the bug happens and how to fix it. Furthermore, a single location might not be powerful enough to discriminate failures from correct executions. To address these challenges, recent studies extract bug signatures, which are composed of multiple locations that appear together in a particular order and signify the occurrence of a bug. The latest study on bug signatures, by Cheng et al., models program executions as graphs. Two sets of graphs, corresponding to failing and correct executions, are then contrasted to extract the most discriminative connected subgraphs, which serve as bug signatures. However, this approach has two limitations: (1) the returned signatures might not be minimal, and (2) they can only capture localized bug context. In this work, we develop a signature minimization technique to capture minimal discriminative signatures. We also propose a signature fusion technique that fuses disconnected subgraphs so that our method can capture bug contexts spanning multiple locations. An experimental study on the Siemens and Space datasets shows the effectiveness of the proposed bug signature minimization and fusion techniques. Compared with the state-of-the-art bug signature mining technique, we reduce the number of bugs missed by up to 57.7% and the average number of nodes traversed by up to 85.6%.
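The ideas of minimization and fusion can be illustrated with a small sketch. It is not the algorithm of this work: it assumes each execution is reduced to a set of graph edges (ignoring connectivity and ordering), assumes information gain as the discriminative measure, and uses a simple greedy edge removal. The names `info_gain`, `minimize`, and `fuse`, as well as the toy executions, are hypothetical.

```python
from math import log2

def entropy(pos, neg):
    """Binary entropy of a group with pos failing and neg passing runs."""
    if pos == 0 or neg == 0:
        return 0.0
    p = pos / (pos + neg)
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def info_gain(signature, failing, passing):
    """Information gain of the test 'run contains every edge of signature'."""
    f_in = sum(signature <= run for run in failing)
    p_in = sum(signature <= run for run in passing)
    f_out, p_out = len(failing) - f_in, len(passing) - p_in
    total = len(failing) + len(passing)
    conditional = ((f_in + p_in) / total) * entropy(f_in, p_in) \
        + ((f_out + p_out) / total) * entropy(f_out, p_out)
    return entropy(len(failing), len(passing)) - conditional

def minimize(signature, failing, passing, eps=1e-12):
    """Greedily drop edges while the score does not decrease (assumed strategy)."""
    sig = frozenset(signature)
    best = info_gain(sig, failing, passing)
    changed = True
    while changed and len(sig) > 1:
        changed = False
        for edge in sorted(sig):
            candidate = sig - {edge}
            gain = info_gain(candidate, failing, passing)
            if gain >= best - eps:
                sig, best, changed = candidate, gain, True
                break
    return sig

def fuse(sig_a, sig_b, failing, passing, eps=1e-12):
    """Return the union of two signatures if it beats both parts, else None."""
    fused = frozenset(sig_a) | frozenset(sig_b)
    parts = max(info_gain(frozenset(sig_a), failing, passing),
                info_gain(frozenset(sig_b), failing, passing))
    if info_gain(fused, failing, passing) > parts + eps:
        return fused
    return None

# Toy data: the failure needs both ("b", "c") and ("x", "y"), which are disconnected.
AB, BC, CD, XY = ("a", "b"), ("b", "c"), ("c", "d"), ("x", "y")
failing = [{AB, BC, XY}, {AB, BC, XY, CD}]
passing = [{AB, BC}, {AB, XY}, {AB}]

minimal = minimize({AB, BC, XY}, failing, passing)  # drops the uninformative edge AB
fused = fuse({BC}, {XY}, failing, passing)          # the two edges together separate the runs
```

In this toy setting, neither `("b", "c")` nor `("x", "y")` alone separates failing from passing runs, which is the kind of bug context spanning multiple locations that fusing disconnected parts is meant to capture.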

[1] David Lo, et al. Search-based fault localization, 2011, 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011).

[2] Hong Cheng, et al. Identifying bug signatures using discriminative graph mining, 2009, ISSTA.

[3] Gregory Tassey, et al. Prepared for what, 2007.

[4] Thomas J. Ostrand, et al. Experiments on the effectiveness of dataflow- and control-flow-based test adequacy criteria, 1994, Proceedings of 16th International Conference on Software Engineering.

[5] Gregg Rothermel, et al. Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact, 2005, Empirical Software Engineering.

[6] Andreas Zeller, et al. Simplifying and Isolating Failure-Inducing Input, 2002, IEEE Trans. Software Eng.

[7] Chao Liu, et al. SOBER: statistical model-based bug localization, 2005, ESEC/FSE-13.

[8] Petra Perner, et al. Data Mining - Concepts and Techniques, 2002, Künstliche Intell.

[9] Jiawei Han, et al. BIDE: efficient mining of frequent closed sequences, 2004, Proceedings. 20th International Conference on Data Engineering.

[10] Alessandro Orso, et al. Rapid: Identifying Bug Signatures to Support Debugging Activities, 2008, 2008 23rd IEEE/ACM International Conference on Automated Software Engineering.

[11] Michael I. Jordan, et al. Bug isolation via remote program sampling, 2003, PLDI.

[12] H. Cleve, et al. Locating causes of program failures, 2005, Proceedings. 27th International Conference on Software Engineering, 2005. ICSE 2005.

[13] David Lo, et al. Comprehensive evaluation of association measures for fault localization, 2010, 2010 IEEE International Conference on Software Maintenance.

[14] Mary Jean Harrold, et al. Empirical evaluation of the tarantula automatic fault-localization technique, 2005, ASE.

[15] Steven P. Reiss, et al. Fault localization with nearest neighbor queries, 2003, 18th IEEE International Conference on Automated Software Engineering, 2003. Proceedings.