Wanna improve process mining results? It's high time we consider data quality issues seriously

The growing interest in process mining is fueled by the increasing availability of event data. Process mining techniques use event logs to automatically discover process models, check conformance, identify bottlenecks and deviations, suggest improvements, and predict processing times. The lion's share of process mining research has been devoted to analysis techniques. However, properly handling the problems that arise when analyzing the event logs used as input is critical for the success of any process mining effort. In this paper, we identify four categories of process characteristics issues that may manifest in an event log (e.g., problems related to event granularity and case heterogeneity) and 27 classes of event log quality issues (e.g., problems related to timestamps in event logs, imprecise activity names, and missing events). The systematic identification and analysis of these issues call for a consolidated effort from the process mining community. Five real-life event logs are analyzed to illustrate the omnipresence of process and event log issues. We hope that these findings will encourage systematic logging approaches (to prevent event log issues), repair techniques (to alleviate event log issues), and analysis techniques (to deal with the manifestation of process characteristics in event logs).
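To make the kinds of event log quality issues named above more concrete, the following is a minimal Python sketch of three simple checks: missing timestamps, timestamps that fall exactly on midnight (a possible sign of date-only granularity), and activity names that differ only in case or whitespace. The toy log, its field names (`case`, `activity`, `timestamp`), and the three helper functions are assumptions made purely for illustration. They are not the paper's 27 issue classes, nor a method taken from the paper.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy event log; every value here is invented for illustration.
EVENT_LOG = [
    {"case": "c1", "activity": "Register", "timestamp": "2013-01-07 09:15:00"},
    {"case": "c1", "activity": "register ", "timestamp": "2013-01-07 00:00:00"},
    {"case": "c1", "activity": "Examine", "timestamp": None},
    {"case": "c2", "activity": "Register", "timestamp": "2013-01-08 10:30:00"},
    {"case": "c2", "activity": "Examine", "timestamp": "2013-01-08 00:00:00"},
]


def missing_timestamps(log):
    """Return the indices of events that carry no timestamp."""
    return [i for i, event in enumerate(log) if not event["timestamp"]]


def midnight_timestamps(log):
    """Return the indices of events stamped exactly at 00:00:00.

    Such events are only a hint that the source system recorded
    dates without a time of day; a real midnight event looks the same.
    """
    hits = []
    for i, event in enumerate(log):
        if event["timestamp"]:
            ts = datetime.strptime(event["timestamp"], "%Y-%m-%d %H:%M:%S")
            if (ts.hour, ts.minute, ts.second) == (0, 0, 0):
                hits.append(i)
    return hits


def label_variants(log):
    """Group activity names that differ only in case or surrounding whitespace."""
    groups = defaultdict(set)
    for event in log:
        groups[event["activity"].strip().lower()].add(event["activity"])
    # Keep only normalized names that map to more than one raw spelling.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    print("missing timestamps:", missing_timestamps(EVENT_LOG))
    print("midnight timestamps:", midnight_timestamps(EVENT_LOG))
    print("label variants:", label_variants(EVENT_LOG))
```

Checks like these only flag candidates for inspection. Deciding whether a flagged event reflects a genuine logging problem still requires knowledge of the process and of the system that produced the log.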
