On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization

Process mining algorithms discover a process model from an event log. The resulting process model is supposed to describe all possible event sequences of the underlying system. Generalization is a process model quality dimension of interest. A generalization metric should quantify the extent to which a process model represents both the observed event sequences contained in the event log and the unobserved event sequences of the system. Most of the metrics available in the literature cannot properly quantify the generalization of a process model. A recently published method [1], called Adversarial System Variant Approximation, leverages Generative Adversarial Networks to approximate the underlying event sequence distribution of a system from an event log. While this method demonstrated performance gains over existing methods in measuring the generalization of process models, its experimental evaluations were performed under ideal conditions. This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions, such as biased and limited event logs. Moreover, experiments are performed to investigate how the originally proposed value of the method's sampling hyperparameter affects its performance in measuring generalization. The results confirm the need to raise awareness about the working conditions of the Adversarial System Variant Approximation method. The outcomes of this paper also serve to initiate future research directions.

[1] Daniel Amyot, et al. Process mining in healthcare: a systematised literature review, 2016, Int. J. Electron. Heal.

[2] Jason Yosinski, et al. Metropolis-Hastings Generative Adversarial Networks, 2018, ICML.

[3] Boudewijn F. van Dongen, et al. On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery, 2012, OTM Conferences.

[4] Boudewijn F. van Dongen, et al. Replaying history on process models for conformance checking and performance analysis, 2012, WIREs Data Mining Knowl. Discov.

[5] Houshang Darabi, et al. Improving the In-Hospital Mortality Prediction of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture, 2021, IEEE Journal of Biomedical and Health Informatics.

[6] Bart Baesens, et al. Determining Process Model Precision and Generalization with Weighted Artificial Negative Events, 2014, IEEE Transactions on Knowledge and Data Engineering.

[7] N. Metropolis, et al. Equation of State Calculations by Fast Computing Machines, 1953, Resonance.

[8] Seppe K. L. M. vanden Broucke, et al. Fodina: A robust and flexible heuristic process discovery technique, 2017, Decis. Support Syst.

[9] Houshang Darabi, et al. Process Mining of Programmable Logic Controllers: Input/Output Event Logs, 2019, 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE).

[10] Houshang Darabi, et al. Adversarial System Variant Approximation to Quantify Process Model Generalization, 2020, IEEE Access.

[11] Benoît Depaire, et al. Towards Confirmatory Process Discovery: Making Assertions About the Underlying System, 2018, Bus. Inf. Syst. Eng.

[12] Niek Tax, et al. Evaluating Conformance Measures in Process Mining using Conformance Propositions (Extended version), 2019, Trans. Petri Nets Other Model. Concurr.

[13] Marlon Dumas, et al. Split miner: automated discovery of accurate and simple business process models from event logs, 2019, Knowledge and Information Systems.

[14] Houshang Darabi, et al. Modeling and integration of hospital information systems with Petri nets, 2009, 2009 IEEE/INFORMS International Conference on Service Operations, Logistics and Informatics.

[15] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[16] Boudewijn F. van Dongen, et al. Business process mining: An industrial application, 2007, Inf. Syst.

[17] Minsu Cho, et al. A system architecture for manufacturing process analysis based on big data and process mining techniques, 2014, 2014 IEEE International Conference on Big Data (Big Data).

[18] Wil M. P. van der Aalst, et al. The impact of biased sampling of event logs on the performance of process discovery, 2021, Computing.

[19] Yingyu Liang, et al. Generalization and Equilibrium in Generative Adversarial Nets (GANs), 2017, ICML.

[20] Nina Narodytska, et al. RelGAN: Relational Generative Adversarial Networks for Text Generation, 2018, ICLR.

[21] Benoît Depaire, et al. A comparative study of existing quality measures for process discovery, 2017, Inf. Syst.

[22] Jana-Rebecca Rehse, et al. Process Mining and the Black Swan: An Empirical Analysis of the Influence of Unobserved Behavior on the Quality of Mined Process Models, 2017, Business Process Management Workshops.

[23] Boudewijn F. van Dongen, et al. Quality Dimensions in Process Discovery: The Importance of Fitness, Precision, Generalization and Simplicity, 2014, Int. J. Cooperative Inf. Syst.

[24] Josep Carmona, et al. A Unified Approach for Measuring Precision and Generalization Based on Anti-alignments, 2016, BPM.