Decomposed and parallel process discovery: A framework and application

Abstract The rapid growth of event data motivates the development of decomposed and parallel process discovery, which handles a large event log by decomposing it into multiple small event logs, discovering process models from these small logs in parallel, and merging the discovered models. As a result, process discovery from a large event log takes less time. Currently, the passage-based and maximal-decomposition-based techniques successfully partition activities by decomposing the causal graph structure derived from a large event log. In this paper, we propose a five-step framework on which various decomposed and parallel process discovery techniques can be built by simply combining and adapting existing techniques. We then propose a technique based on this framework, RPSTHD, which uses the refined process structure tree (RPST), the heuristic miner, and process mining using integer linear programming (ILP), among other techniques. An experimental evaluation shows that our technique significantly outperforms the state-of-the-art decomposed discovery techniques in both efficiency and effectiveness.
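
The decompose, discover-in-parallel, and merge idea described above can be illustrated with a minimal Python sketch. It is not the RPSTHD technique or the five-step framework itself: the activity clusters are supplied by hand rather than derived from a causal graph or an RPST, a simple directly-follows counter stands in for a real miner such as the heuristic miner or the ILP miner, and all function names (`project`, `discover_dfg`, `merge`, `decomposed_discovery`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def project(log, activities):
    """Project a log onto an activity cluster, dropping traces that become empty."""
    projected = ([a for a in trace if a in activities] for trace in log)
    return [trace for trace in projected if trace]


def discover_dfg(sublog):
    """Stand-in miner (assumption): count directly-follows pairs in a sublog."""
    dfg = Counter()
    for trace in sublog:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def merge(models):
    """Merge sub-models by taking the union of edges (max count on shared edges)."""
    merged = {}
    for model in models:
        for edge, count in model.items():
            merged[edge] = max(merged.get(edge, 0), count)
    return merged


def decomposed_discovery(log, clusters, workers=4):
    """Decompose the log, discover one model per sublog in parallel, then merge."""
    sublogs = [project(log, cluster) for cluster in clusters]
    # Threads keep the sketch simple; a process pool or a cluster would be
    # the realistic choice for CPU-bound miners.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        models = list(pool.map(discover_dfg, sublogs))
    return merge(models)


# Toy log with two traces; the overlapping clusters are chosen by hand (assumption).
log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
clusters = [{"a", "b", "c"}, {"b", "c", "d"}]
model = decomposed_discovery(log, clusters)
```

On this toy log the merged result contains the same directly-follows edges as mining the whole log at once. That is a property of the hand-picked overlapping clusters, not of projection in general: a poorly chosen partition can introduce directly-follows pairs that never occur in the original log, which is why the choice of activity partition matters.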
