Program behavior characterization and clustering: An empirical study for failure clustering

Failure clustering is considered an effective way to reduce the burden of the software development and maintenance stages. However, because the overall software fault space is extremely large, the inherent complexity of the “fault-error-failure” chain is an obstacle to failure clustering. In this paper, we present a method of program behavior characterization and clustering that examines and clusters the failure behaviors of programs based on their normal executions. We first characterize program executions in order to model runtime behaviors. The runtime behaviors are then clustered using a typical fuzzy clustering technique. We then evaluate two things: the accuracy of the runtime behavior model, and whether a cluster obtained from runtime characterization corresponds to a cluster obtained from failure clustering. On the SPEC CPU2000 and SPEC CPU2006 benchmark suites, experimental results and analysis show that our method effectively groups similar failure behaviors on the basis of their runtime behavior clusters.
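To make the clustering step concrete, the following is a minimal sketch of standard fuzzy c-means applied to per-execution feature vectors. It is an illustration only: the abstract does not say which fuzzy technique or which runtime features are used, so the algorithm choice, the function names `memberships_for` and `fuzzy_c_means`, the fuzzifier `m`, and the two-dimensional feature values are all assumptions.

```python
import math

def memberships_for(points, centers, m=2.0):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    eps = 1e-12  # guards against division by zero when a point sits on a center
    rows = []
    for p in points:
        dists = [max(math.dist(p, c), eps) for c in centers]
        row = []
        for d_i in dists:
            denom = sum((d_i / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
            row.append(1.0 / denom)
        rows.append(row)
    return rows

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    dim = len(points[0])
    for _ in range(iters):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))
        centers = new_centers
    # Recompute memberships so they match the final centers.
    return centers, memberships_for(points, centers, m)

# Hypothetical runtime feature vectors, one per program execution
# (for example, two normalized profile metrics); values are made up.
points = [
    (0.10, 0.20), (0.15, 0.25), (0.12, 0.18),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]
# Deterministic initial centers: one point taken from each apparent group.
centers, u = fuzzy_c_means(points, [points[0], points[3]])

for p, row in zip(points, u):
    print(p, [round(x, 3) for x in row])
```

Unlike hard clustering, each execution receives a degree of membership in every cluster, which allows an execution whose behavior lies between two groups to be represented as such rather than forced into one.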
