Process Mining in the Large: A Tutorial

Process mining has recently emerged as a scientific discipline at the interface between process models and event data. Conventional Business Process Management (BPM) and Workflow Management (WfM) approaches and tools are mostly model-driven and pay little attention to event data. Data Mining (DM), Business Intelligence (BI), and Machine Learning (ML), by contrast, focus on data without considering end-to-end process models. Process mining aims to bridge the gap between BPM and WfM on the one side and DM, BI, and ML on the other. The challenge is to turn torrents of event data (“Big Data”) into valuable insights related to process performance and compliance. Process mining results can be used to identify and understand bottlenecks, inefficiencies, deviations, and risks. This tutorial paper introduces basic process mining techniques for process discovery and conformance checking. It also discusses some very general decomposition results that allow process discovery and conformance checking problems to be decomposed and distributed, thus enabling process mining in the large.
