Mining Logs to Model the Use of a System

Background. Process mining is a technique for building process models from "execution logs" (i.e., events triggered by the execution of a process). State-of-the-art tools can provide process managers with different graphical representations of such models. Managers compare these models with an ideal process model or use them to support process improvement. They typically select the representation based on their experience and knowledge of the system. Aim. This work studies how to automatically build process models that represent the actual intents (or uses) of users while they interact with a software system. Such intents are expressed as sets of actions that a user performs on a system to achieve specific use goals. Method. This work applies the theory of Hidden Markov Models to mine use logs and automatically model the use of a system. Results. Unlike the models generated with process mining tools, the Hidden Markov Models automatically generated in this study expose the intents of a user and can provide managers with a faithful representation of the use of their systems. Conclusions. The automatic generation of Hidden Markov Models can achieve a good level of accuracy in representing users' actual intents, provided the log dataset is carefully chosen. In our study, the information contained in a one-month set of logs helped automatically build Hidden Markov Models with superior accuracy and similar expressiveness compared with the models built together with the company's stakeholder.
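As a rough illustration of how a Hidden Markov Model links logged actions to hidden intents, the sketch below decodes the most likely intent sequence for a short action trace using the Viterbi algorithm. It is not the procedure used in the study: the intent names, action names, and all probabilities are hypothetical and hand-set, whereas the study generates its models automatically from real use logs.

```python
import math

# Hypothetical hidden intents and logged actions; none of these values
# come from the study. They are assumptions chosen for illustration.
INTENTS = ["browse", "purchase"]
START = {"browse": 0.7, "purchase": 0.3}
TRANS = {
    "browse": {"browse": 0.8, "purchase": 0.2},
    "purchase": {"browse": 0.3, "purchase": 0.7},
}
EMIT = {
    "browse": {"search": 0.6, "view_item": 0.35, "checkout": 0.05},
    "purchase": {"search": 0.1, "view_item": 0.3, "checkout": 0.6},
}


def viterbi(actions):
    """Return the most likely intent sequence for a non-empty list of actions."""
    # scores[s] is the best log-probability of any intent path ending in s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][actions[0]]) for s in INTENTS
    }
    backpointers = []
    for action in actions[1:]:
        new_scores, pointers = {}, {}
        for s in INTENTS:
            # Pick the predecessor intent that maximizes the path score into s.
            best_prev = max(
                INTENTS, key=lambda p: scores[p] + math.log(TRANS[p][s])
            )
            new_scores[s] = (
                scores[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][action])
            )
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final intent to recover the full path.
    state = max(INTENTS, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return list(reversed(path))


trace = ["search", "view_item", "checkout", "checkout"]
print(viterbi(trace))
```

With these assumed parameters, the decoder labels each logged action with one intent. In practice the transition and emission probabilities would be estimated from the logs rather than written by hand.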
