Process discovery from event data: Relating models and logs through abstractions

Event data are collected in logistics, manufacturing, finance, healthcare, customer relationship management, e-learning, e-government, and many other domains. The events found in these domains typically refer to activities executed by resources at particular times and for a particular case (i.e., a process instance). Process mining techniques are able to exploit such data. In this article, we focus on process discovery. However, process mining also includes conformance checking, performance analysis, decision mining, organizational mining, predictions, recommendations, etc. These techniques help to diagnose problems and improve processes. All process mining techniques involve both event data and process models. Therefore, a typical first step is to automatically learn a control-flow model from the event data. This is very challenging, but in recent years many powerful discovery techniques have been developed. It is not easy to compare these techniques since they use different representations and make different assumptions. Users often need to resort to trying different algorithms in an ad-hoc manner. Developers of new techniques are often trying to solve specific instances of a more general problem. Therefore, we aim to unify existing approaches by focusing on log and model abstractions. These abstractions link observed and modeled behavior: concrete behaviors recorded in event logs are related to possible behaviors represented by process models. Hence, such behavioral abstractions provide an “interface” between the two. We discuss four discovery approaches involving three abstractions and different types of process models (Petri nets, block-structured models, and declarative models). The goal is to provide a comprehensive understanding of process discovery and show how to develop new techniques. Examples illustrate the different approaches, and pointers to software are given.
The discussion on abstractions and process representations is also used to reflect on the gap between process mining literature and commercial process mining tools. This helps users to select an appropriate process discovery technique. Moreover, structuring the role of internal abstractions and representations helps to broaden the view and facilitates the creation of new discovery approaches.

∗Process and Data Science (PADS), RWTH Aachen University, Aachen, Germany
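To make the idea of an abstraction acting as an "interface" between a log and a model more concrete, the following minimal Python sketch computes one widely used log abstraction, the directly-follows relation, and compares it with the same abstraction of a model. The abstract does not name the three abstractions it discusses, so the choice of the directly-follows relation is an assumption made for illustration. The example log, the set `model_pairs`, and the function names are likewise hypothetical.

```python
from collections import Counter


def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def start_and_end_activities(log):
    """Count the first and last activity of every non-empty trace."""
    starts = Counter(trace[0] for trace in log if trace)
    ends = Counter(trace[-1] for trace in log if trace)
    return starts, ends


# Hypothetical event log: each trace is the activity sequence of one case.
log = [
    ("a", "b", "c", "d"),
    ("a", "c", "b", "d"),
    ("a", "e", "d"),
]

dfg = directly_follows(log)
starts, ends = start_and_end_activities(log)

# Hypothetical model abstraction: the directly-follows pairs allowed by a
# model in which b and c run in either order between a and d.
model_pairs = {
    ("a", "b"), ("b", "c"), ("c", "d"),
    ("a", "c"), ("c", "b"), ("b", "d"),
}

# Comparing both abstractions relates observed and modeled behavior.
unmodeled = set(dfg) - model_pairs   # seen in the log, not allowed by the model
unobserved = model_pairs - set(dfg)  # allowed by the model, never seen in the log

print("directly-follows counts:", dict(dfg))
print("observed but not modeled:", sorted(unmodeled))
print("modeled but not observed:", sorted(unobserved))
```

In this sketch, both the log and the model are reduced to sets of activity pairs, so they can be compared without committing to a particular model notation. A discovery technique in this style would work in the opposite direction and construct a model whose abstraction matches the one computed from the log.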
