Clustering-Based Predictive Process Monitoring

The enactment of business processes is generally supported by information systems that record data about each process execution (a.k.a. case). This data can be analyzed via a family of methods broadly known as process mining. Predictive process monitoring is a process mining technique concerned with predicting how running (uncompleted) cases will unfold up to their completion. In this paper, we propose a predictive process monitoring framework for estimating the probability that a given predicate will be fulfilled upon completion of a running case. The framework takes into account both the sequence of events observed in the current trace and the data attributes associated with these events. The prediction problem is approached in two phases. First, prefixes of previous (completed) cases are clustered according to control flow information. Second, a classifier is built for each cluster, using event data attributes to discriminate, within that cluster, between cases that lead to a fulfillment of the predicate under examination and cases that lead to a violation. At runtime, a prediction is made on a running case by mapping it to a cluster and applying the corresponding classifier. The framework has been implemented in the ProM toolset and validated on a log pertaining to the treatment of cancer patients in a large hospital.
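
The two-phase idea described above can be illustrated with a minimal Python sketch. It is not the ProM implementation: the abstract does not name specific algorithms, so plain k-means over activity-frequency vectors stands in for the control-flow clustering, and a one-attribute decision stump stands in for the per-cluster classifier. The activity names, the numeric `age` attribute, the toy training data, and all function names are hypothetical.

```python
from collections import Counter

def encode(prefix, alphabet):
    """Control-flow encoding (assumed): activity frequencies of a prefix."""
    counts = Counter(prefix)
    return [counts[a] for a in alphabet]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=20):
    """Plain k-means; the first k distinct points seed the centroids."""
    centroids = []
    for p in points:
        if p not in centroids:
            centroids.append(list(p))
        if len(centroids) == k:
            break
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group (keep it if empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def fit_stump(rows):
    """rows: (attribute value, label) pairs with label 1 = predicate fulfilled.
    Returns (threshold, P(fulfilled | value <= t), P(fulfilled | value > t))."""
    best = None
    for t in sorted({v for v, _ in rows}):
        low = [y for v, y in rows if v <= t]
        high = [y for v, y in rows if v > t]
        errors = (min(sum(low), len(low) - sum(low))
                  + min(sum(high), len(high) - sum(high)))
        if best is None or errors < best[0]:
            p_low = sum(low) / len(low)
            p_high = sum(high) / len(high) if high else p_low
            best = (errors, t, p_low, p_high)
    return best[1:]

def train_model(cases, k):
    """Phase 1: cluster prefixes by control flow. Phase 2: one stump per cluster."""
    alphabet = sorted({a for prefix, _, _ in cases for a in prefix})
    vectors = [encode(prefix, alphabet) for prefix, _, _ in cases]
    centroids = kmeans(vectors, k)
    rows_per_cluster = {}
    for vec, (_, value, label) in zip(vectors, cases):
        rows_per_cluster.setdefault(nearest(vec, centroids), []).append((value, label))
    stumps = {c: fit_stump(rows) for c, rows in rows_per_cluster.items()}
    return alphabet, centroids, stumps

def predict(model, prefix, value):
    """Map the running case to a cluster, then apply that cluster's classifier."""
    alphabet, centroids, stumps = model
    stump = stumps.get(nearest(encode(prefix, alphabet), centroids))
    if stump is None:  # cluster had no training cases
        return None
    threshold, p_low, p_high = stump
    return p_low if value <= threshold else p_high

# Hypothetical historical prefixes: (activities so far, age, predicate fulfilled?)
train = [
    (["register", "triage", "surgery"], 70, 1),
    (["register", "triage", "surgery"], 65, 1),
    (["register", "triage", "surgery"], 30, 0),
    (["register", "lab", "lab"], 40, 1),
    (["register", "lab", "lab"], 80, 0),
    (["register", "lab", "lab"], 75, 0),
]
model = train_model(train, k=2)
```

With this toy data, `predict(model, ["register", "triage", "surgery"], 68)` estimates the fulfillment probability from the surgery-like cluster only, while `predict(model, ["register", "lab", "lab"], 78)` uses the classifier trained on the lab-like cluster. The same attribute value can therefore lead to different predictions depending on the control-flow cluster, which is the point of training one classifier per cluster.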
