P^3-Folder: Optimal Model Simplification for Improving Accuracy in Process Performance Prediction

Operational process models such as generalised stochastic Petri nets (GSPNs) are useful when answering performance queries on business processes (e.g. ‘how long will it take for a case to finish?’). Recently, methods for process mining have been developed to discover and enrich operational models based on a log of recorded executions of processes, which enables evidence-based process analysis. To avoid a bias due to infrequent execution paths, discovery algorithms strive for a balance between over-fitting and under-fitting regarding the originating log. However, state-of-the-art discovery algorithms address this balance solely for the control-flow dimension, neglecting possible over-fitting in terms of performance annotations. In this work, we thus offer a technique for performance-driven model reduction of GSPNs, using structural simplification rules. Each rule induces an error in performance estimates with respect to the original model. However, we show that this error is bounded and that the reduction in model parameters incurred by the simplification rules increases the accuracy of process performance prediction. We further show how to find an optimal sequence of applying simplification rules to obtain a minimal model under a given error budget for the performance estimates. We evaluate the approach with a real-world case in the healthcare domain, showing that model simplification indeed yields significant improvements in time prediction accuracy.
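The idea of choosing simplification rules under an error budget can be illustrated with a small sketch. It is not the formulation used in this work, whose details are not given in the abstract. It assumes, purely for illustration, that each candidate rule application has a known integer error bound, that these errors add up, and that the candidates can be applied independently of one another. Under those assumptions the selection reduces to a plain 0/1 knapsack. The function `select_folds` and the rule names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, error_cost, params_removed).
# error_cost is an integer error bound (e.g. in seconds); additivity of
# errors and independence of candidates are assumptions of this sketch.
Fold = Tuple[str, int, int]


def select_folds(folds: List[Fold], budget: int) -> Tuple[int, List[str]]:
    """Pick candidates that remove the most parameters within the error budget."""
    # best[b] = (parameters removed, chosen names) using at most b error units
    best: List[Tuple[int, List[str]]] = [(0, [])] * (budget + 1)
    for name, cost, gain in folds:
        # Iterate budgets downward so each candidate is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate_gain = best[b - cost][0] + gain
            if candidate_gain > best[b][0]:
                best[b] = (candidate_gain, best[b - cost][1] + [name])
    return best[budget]


if __name__ == "__main__":
    candidates = [("seq_A_B", 2, 3), ("xor_C_D", 3, 4), ("and_E_F", 4, 5)]
    removed, chosen = select_folds(candidates, budget=5)
    print(removed, chosen)
```

With the three hypothetical candidates and a budget of 5 error units, the sketch selects `seq_A_B` and `xor_C_D`, which together remove 7 parameters at a total error of 5. A realistic variant would also have to respect dependencies between nested model fragments, which this sketch ignores.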
