Automated Problem Determination Using Call-Stack Matching

We present an architecture and algorithms for automated software problem determination using call-stack matching. When software is used by a large user community, the same problem may recur many times. We show that such recurrences can be detected by matching the program's call stack against a historical database of call stacks, so that once a problem has been resolved, future cases of the same or a similar problem can be resolved automatically. This would greatly reduce the number of cases that human support analysts need to handle. We also show how a call-stack matching algorithm can be learned automatically from a small sample of call stacks labeled by human analysts, and we examine the performance of this learning algorithm on two different data sets.
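The matching idea can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in this abstract: it assumes that a call stack is a list of function names (outermost frame first), that similarity is the ratio computed by Python's standard `difflib.SequenceMatcher`, and that a fixed threshold of 0.7 decides whether two stacks describe the same problem. The database entries, function names, and threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical historical database: each resolved problem stores one call
# stack (outermost frame first) and the fix recorded by an analyst.
KNOWN_PROBLEMS = [
    {"id": "P-1",
     "stack": ["main", "load_config", "parse_xml", "read_token"],
     "fix": "Upgrade the XML parser"},
    {"id": "P-2",
     "stack": ["main", "run_query", "open_cursor", "alloc_buffer"],
     "fix": "Increase the buffer pool size"},
]


def stack_similarity(a, b):
    """Return a similarity in [0, 1] between two lists of frame names.

    Assumption: difflib's ratio (2 * matching frames / total frames)
    stands in for whatever similarity measure the paper actually learns.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def find_known_problem(stack, database=KNOWN_PROBLEMS, threshold=0.7):
    """Return (best_match, score); best_match is None below the threshold."""
    best, best_score = None, 0.0
    for problem in database:
        score = stack_similarity(stack, problem["stack"])
        if score > best_score:
            best, best_score = problem, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


if __name__ == "__main__":
    # A new failure whose stack differs from P-1 only in the innermost frame.
    incoming = ["main", "load_config", "parse_xml", "read_attr"]
    match, score = find_known_problem(incoming)
    if match is not None:
        print(f"Matched {match['id']} (score {score:.2f}): {match['fix']}")
    else:
        print(f"No known problem matched (best score {score:.2f})")
```

In this sketch the threshold is fixed by hand. The abstract instead describes learning the matching algorithm from analyst-labeled call stacks, which would replace both the hard-coded similarity measure and the threshold.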
