Discovering more precise process models from event logs by filtering out chaotic activities

Process discovery is concerned with automatically generating a process model that describes a business process from the execution data of that process. Real-life event logs can contain chaotic activities. These activities are independent of the state of the process and can, therefore, happen at rather arbitrary points in time. We show that the presence of such chaotic activities in an event log heavily degrades the quality of the process models that process discovery techniques can discover. The current practice for filtering activities from event logs is simply to filter out infrequent activities. We show that frequency-based filtering of activities does not solve the problems caused by chaotic activities. Moreover, we propose a novel technique to filter out chaotic activities from event logs. We evaluate this technique on a collection of seventeen real-life event logs that originate from both the business process management domain and the smart home environment domain. The evaluation shows that the proposed activity filtering methods enable the discovery of process models that are more behaviorally specific than process models discovered using standard frequency-based filtering.
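To illustrate why frequency-based filtering may miss chaotic activities, the following is a minimal sketch of the frequency-based baseline only, not of the proposed technique. The toy log, the activity names, the function name `filter_infrequent_activities`, and the `min_share` threshold are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def filter_infrequent_activities(log, min_share):
    """Frequency-based baseline: keep activities whose share of all
    events in the log is at least min_share (hypothetical threshold)."""
    counts = Counter(activity for trace in log for activity in trace)
    total = sum(counts.values())
    keep = {a for a, c in counts.items() if c / total >= min_share}
    return [[a for a in trace if a in keep] for trace in log]


# Hypothetical toy log: A -> B -> C is the structured behavior,
# "x" is a chaotic activity occurring at arbitrary positions,
# and "r" is a rare activity.
log = [
    ["x", "A", "B", "C"],
    ["A", "x", "B", "C"],
    ["A", "B", "x", "C"],
    ["A", "B", "C", "x"],
    ["A", "B", "r", "C"],
]

filtered = filter_infrequent_activities(log, min_share=0.1)
for trace in filtered:
    print(trace)
```

In this toy log the rare activity `r` falls below the threshold and is removed, while `x` is frequent enough to survive even though its position carries no information about the state of the process. This is the situation in which frequency alone cannot separate chaotic from structured activities.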
