Improved error reporting for software that uses black-box components

An error occurs when software cannot complete a requested action as a result of some problem with its input, configuration, or environment. A high-quality error report allows a user to understand and correct the problem. Unfortunately, the quality of error reports has been decreasing as software becomes more complex and layered. End users take the cryptic error messages given to them by programs and struggle to fix their problems using search engines and support websites. Developers cannot improve their error messages when they receive an ambiguous or otherwise insufficient error indicator from a black-box software component. We introduce Clarify, a system that improves error reporting by classifying application behavior. Clarify uses minimally invasive monitoring to generate a behavior profile, which is a summary of the program's execution history. A machine learning classifier uses the behavior profile to classify the application's behavior, thereby enabling a more precise error report than the output of the application itself. We evaluate a prototype Clarify system on ambiguous error messages generated by large, modern applications such as gcc, LaTeX, and the Linux kernel. For a performance cost of less than 1% on user applications and 4.7% on the Linux kernel, the prototype correctly disambiguates at least 85% of application behaviors that result in ambiguous error reports. This accuracy does not degrade significantly with more behaviors: a Clarify classifier for 81 LaTeX error messages is at most 2.5% less accurate than a classifier for 27 LaTeX error messages. Finally, we show that without any human effort to build a classifier, Clarify can provide nearest-neighbor software support, in which users who experience a problem are told about 5 other users who might have had the same problem. On average, 2.3 of the 5 users that Clarify identifies have experienced the same problem.
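To make the two modes of operation concrete, the following minimal sketch illustrates the idea of classifying behavior profiles and of nearest-neighbor support. It is illustrative only: the per-function call-count representation of a behavior profile, the toy runs and labels, and the use of scikit-learn's DecisionTreeClassifier and NearestNeighbors are assumptions for the example, not the instrumentation or learners of the actual Clarify prototype.

from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import NearestNeighbors
import numpy as np

# Assumed representation: each row is one run's behavior profile, here a
# vector of call counts for a handful of monitored functions. Each label
# names the underlying problem behind an ambiguous error report.
profiles = np.array([
    [120, 3, 0, 45, 7],   # run 1
    [118, 2, 1, 44, 9],   # run 2 (same root cause as run 1)
    [ 10, 87, 5,  0, 2],  # run 3 (different root cause)
    [ 12, 90, 4,  1, 3],  # run 4
])
labels = ["missing-package", "missing-package", "bad-encoding", "bad-encoding"]

# Mode 1: a labeled classifier maps a new run's profile to a precise
# error description, rather than the application's own vague message.
clf = DecisionTreeClassifier().fit(profiles, labels)

def classify_run(profile):
    """Return a disambiguated error label for a new run's behavior profile."""
    return clf.predict([profile])[0]

# Mode 2: with no labeling effort at all, report the k most similar past
# runs so their users can be pointed at each other (the paper uses k = 5).
nn = NearestNeighbors(n_neighbors=2).fit(profiles)

def similar_runs(profile, k=2):
    """Indices of the k historical runs whose profiles are closest."""
    _, idx = nn.kneighbors([profile], n_neighbors=k)
    return idx[0].tolist()

print(classify_run([119, 3, 0, 46, 8]))  # expected: "missing-package"
print(similar_runs([119, 3, 0, 46, 8]))  # expected: runs 1 and 2, i.e. [0, 1]

The design point the sketch tries to capture is that the error message itself never appears as a feature: only the execution history does, which is what lets the approach work even when the failing component is a black box that emits an ambiguous or misleading indicator.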
