An Anti-Noise Process Mining Algorithm Based on Minimum Spanning Tree Clustering

Many human-centric systems now use business process management technology in production. As these systems operate, they accumulate ever more business process logs and human-centric data, yet using and analyzing these event logs effectively remains an urgent challenge. Process mining, a branch of business process management, extracts process knowledge from event logs and builds process models, which helps to detect and improve business processes. Current process mining algorithms deal inadequately with log noise. The family of alpha-algorithms ignores the impact of noise, which is unrealistic for real-life logs, and most process mining algorithms that do handle noise lack reasonable denoising thresholds. This paper gives a new assumption on noise and proposes an anti-noise process mining algorithm based on it. Decision rules for the selective, parallel, and non-free-choice structures are also given. The proposed algorithm framework discovers the process model and transforms it into a Petri net representation. We calculate the distance between traces to build a minimum spanning tree, on which clusters are generated. The traces in all clusters other than the largest are treated as noise, and the largest cluster is mined. In this way, the algorithm discovers the regular routing structures while handling noise. Experimental results show the correctness of the algorithm in comparison with the $\alpha$++ algorithm.
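The clustering step described above (trace distances, a minimum spanning tree, clusters, and keeping only the largest cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance measure or how clusters are cut from the tree, so the edit distance between traces, the `cut_threshold` parameter, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two traces (assumed distance measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def minimum_spanning_tree(points, dist):
    """Prim's algorithm on the complete graph; returns (weight, u, v) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [(float("inf"), -1)] * n  # (cheapest edge weight, tree endpoint)
    best[0] = (0, -1)
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i][0])
        in_tree[u] = True
        if best[u][1] >= 0:
            edges.append((best[u][0], best[u][1], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v][0]:
                    best[v] = (d, u)
    return edges


def split_noise(traces, cut_threshold):
    """Return (kept, noise): traces in the largest MST cluster vs. the rest.

    Clusters are formed by dropping MST edges heavier than cut_threshold
    (a hypothetical cutting rule, not taken from the paper).
    """
    variants = sorted(set(traces))
    edges = minimum_spanning_tree(variants, edit_distance)

    parent = list(range(len(variants)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Union the endpoints of every MST edge that survives the cut.
    for weight, u, v in edges:
        if weight <= cut_threshold:
            parent[find(u)] = find(v)

    clusters = defaultdict(set)
    for i, variant in enumerate(variants):
        clusters[find(i)].add(variant)

    # Cluster size counts every trace occurrence in the log, not just variants.
    counts = Counter(traces)
    largest = max(clusters.values(), key=lambda c: sum(counts[t] for t in c))
    kept = [t for t in traces if t in largest]
    noise = [t for t in traces if t not in largest]
    return kept, noise


# Toy log: three similar variants plus one outlier trace.
log = (
    [("a", "b", "c", "d")] * 5
    + [("a", "c", "b", "d")] * 4
    + [("a", "e", "d")] * 3
    + [("x", "y", "z", "x", "y", "z")]
)
kept, noise = split_noise(log, cut_threshold=3)
```

In this toy log the three similar variants lie at edit distance 2 from one another, while the outlier is at distance 6 from each of them, so cutting the tree at 3 isolates it as noise. The kept traces would then be passed to the discovery step that derives the process model.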
