Exploring program phases for statistical bug localization

Statistical bug isolation techniques attempt to correlate program features (such as predicates and profiled paths) with failures to aid debugging. These techniques collect profile data from multiple executions, both successful and failing, and apply statistical tests to capture this correlation. In this paper, we explore the utility of program phases for statistical bug isolation. Program phases are a concept used primarily by computer architects to speed up architectural simulations. A phase is a set of intervals in a program's execution over which the rates of architectural statistics, such as branch mispredictions, CPU/memory usage, and cache misses, remain almost the same. We found multiple scenarios in which coupling program phases with predicates localizes bugs more accurately than using predicates alone. We demonstrate the use of program phases for bug isolation through experimental results and concrete case studies on medium-size programs, which show that the program points critical to debugging are ranked better than when program phases are not used.
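
The idea of coupling predicates with phases can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's method: the run data is invented, phase IDs are taken as given (in practice they would come from grouping execution intervals with similar architectural statistics), and the score is a plain difference of proportions rather than the statistical test used by the authors. The hypothetical `rank` function scores each feature by how much more often it is observed in failing runs than in successful ones.

```python
from collections import defaultdict

def rank(runs, key):
    """Rank features by (share of failing runs where seen) minus
    (share of successful runs where seen). Illustrative score only."""
    n_fail = sum(1 for r in runs if r["failed"])
    n_pass = len(runs) - n_fail
    fail_count = defaultdict(int)
    pass_count = defaultdict(int)
    for r in runs:
        # Project each (predicate, phase) observation onto the chosen feature
        # and count each feature at most once per run.
        features = {key(obs) for obs in r["observations"]}
        for f in features:
            if r["failed"]:
                fail_count[f] += 1
            else:
                pass_count[f] += 1
    scores = {}
    for f in set(fail_count) | set(pass_count):
        fail_rate = fail_count[f] / n_fail if n_fail else 0.0
        pass_rate = pass_count[f] / n_pass if n_pass else 0.0
        scores[f] = fail_rate - pass_rate
    # Highest score first; ties broken by the feature's string form.
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))

# Invented data: each observation is (predicate that was true, phase ID).
RUNS = [
    {"failed": True,  "observations": {("x > 0", 0), ("x > 0", 2)}},
    {"failed": True,  "observations": {("x > 0", 2), ("y == 0", 1)}},
    {"failed": False, "observations": {("x > 0", 0), ("y == 0", 1)}},
    {"failed": False, "observations": {("x > 0", 0)}},
]

by_predicate = rank(RUNS, key=lambda obs: obs[0])          # predicates alone
by_predicate_and_phase = rank(RUNS, key=lambda obs: obs)   # predicate + phase

print(by_predicate)
print(by_predicate_and_phase)
```

In this toy data the predicate `x > 0` is true in every run, so it carries no signal on its own. Split by phase, it is true in phase 2 only in failing runs, and that pair rises to the top of the ranking.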
