Ensemble-Based Prediction of Business Processes Bottlenecks With Recurrent Concept Drifts

Bottleneck prediction is an important sub-task of process mining that aims to optimize discovered process models by avoiding congestion. This paper discusses ongoing work on incorporating recurrent concept drift into bottleneck prediction in a real-world scenario. We develop a method for predicting whether bottlenecks are likely to appear, and which ones, based on data known before a case starts. We then introduce GRAEC, a weighting mechanism designed to deal with concept drift. The weighting decays over time and can be extended to adapt to seasonality in the data. The methods are applied to a simulation and to a real-world invoicing process in the field of installation services. The results show an improvement in prediction accuracy compared to retraining a model on the most recent data.
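
The idea of a time-decaying ensemble weight with a seasonal extension can be illustrated with a minimal sketch. This is not the GRAEC formula, which the abstract does not specify; the exponential decay, the additive seasonal boost, and the names `model_weight`, `weighted_vote`, `decay`, `period`, and `seasonal_boost` are all assumptions made for illustration only.

```python
import math
from collections import defaultdict


def model_weight(age, decay=0.5, period=None, seasonal_boost=0.0):
    """Hypothetical weight for a model trained `age` time periods ago.

    Assumption: weights decay exponentially with age, and models whose
    age is a positive multiple of `period` get an extra seasonal boost.
    """
    weight = math.exp(-decay * age)
    if period and age > 0 and age % period == 0:
        weight += seasonal_boost
    return weight


def weighted_vote(predictions, ages, **weight_params):
    """Return the label with the largest total weight across the ensemble."""
    scores = defaultdict(float)
    for label, age in zip(predictions, ages):
        scores[label] += model_weight(age, **weight_params)
    return max(scores, key=scores.get)


# Three models trained 0, 1, and 12 periods ago predict a bottleneck label.
predictions = ["none", "approval", "approval"]
ages = [0, 1, 12]

without_seasonality = weighted_vote(predictions, ages)
with_seasonality = weighted_vote(predictions, ages, period=12, seasonal_boost=0.5)
```

In this toy setting, the most recent model dominates when only decay is used, while the seasonal boost lets the model from twelve periods ago tip the vote, which is the kind of behavior a recurrent concept would call for.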