Discovering Hidden Errors from Application Log Traces with Process Mining

Over the past decades, logs have been widely used for detecting and analyzing failures of computer applications. Nevertheless, it is widely accepted by the scientific community that failures might go undetected in the logs. This paper presents a measurement study based on a dataset of 3,794 log traces obtained from normative and failure runs of the Apache web server. We use process mining (i) to infer a model of the normative log behavior, e.g., the presence and ordering of messages in the traces, and (ii) to detect failures within arbitrary traces by looking for deviations from the model (conformance checking). The analysis is done with the Integer Linear Programming (ILP) Miner, Inductive Miner, and Alpha++ Miner algorithms. Our measurements indicate that, although only around 18% of the failure traces contain explicit error keywords and phrases, conformance checking detects up to 87% of the failures at high precision, which means that most of the errors are hidden across the traces.
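The two-step idea in the abstract (learn a model from normative traces, then flag traces that deviate from it) can be illustrated with a minimal sketch. It is a deliberate simplification and not the method evaluated in the paper: the model here is just the set of directly-follows pairs seen in normative traces, whereas the ILP Miner, Inductive Miner, and Alpha++ Miner produce richer process models. The message names, the keyword list, and the function names `learn_model`, `deviations`, and `has_error_keyword` are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Step = Tuple[str, str]

# Artificial markers so that the first and last message of a trace are also checked.
START, END = "<start>", "<end>"


def learn_model(normative_traces: Iterable[Sequence[str]]) -> Set[Step]:
    """Collect every directly-follows pair observed in the normative traces."""
    allowed: Set[Step] = set()
    for trace in normative_traces:
        path = [START, *trace, END]
        allowed.update(zip(path, path[1:]))
    return allowed


def deviations(trace: Sequence[str], allowed: Set[Step]) -> List[Step]:
    """Return the steps of a trace that the learned model never observed."""
    path = [START, *trace, END]
    return [step for step in zip(path, path[1:]) if step not in allowed]


def has_error_keyword(trace: Sequence[str],
                      keywords: Sequence[str] = ("error", "fail")) -> bool:
    """Keyword baseline: does any message contain an explicit error term?"""
    return any(k in msg.lower() for msg in trace for k in keywords)


# Hypothetical normative traces (message types, not real Apache log lines).
normative = [
    ["server_start", "request_received", "response_sent", "server_stop"],
    ["server_start", "request_received", "response_sent",
     "request_received", "response_sent", "server_stop"],
]
model = learn_model(normative)

# A failure trace with no error keyword: the response message is simply missing.
silent_failure = ["server_start", "request_received", "server_stop"]

print(has_error_keyword(silent_failure))   # keyword baseline
print(deviations(silent_failure, model))   # conformance-style check
```

In this toy case the keyword baseline finds nothing, while the ordering check reports the step from `request_received` to `server_stop` as unseen in normative behavior, which is the kind of hidden error the study measures.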
