Learning Workflow Models from Event Logs Using Co-clustering

The authors propose a co-clustering approach that extracts workflow models from event logs. They address two issues that most existing process mining approaches overlook. First, a complex system typically runs multiple workflow models, all of which share the same log system, yet current approaches mainly focus on learning a single workflow model from an event log. Second, most systems support multiple users, and each user is typically associated with a certain number of operation sequences, which may follow one or more workflow models. Users can thus serve as important context when learning workflow models, but current approaches do not take them into account. The authors therefore propose to learn a User Behavior Pattern (UBP), which reflects the usage pattern of a user when accessing a business process system, and to exploit it to discover multiple workflow models from the event log of a complex system. They model a UBP as a probability distribution over sequences, which allows computing the similarity between UBPs and sequences. They then co-cluster users and sequences to generate two types of clusters: user clusters, which group users sharing similar UBPs, and sequence clusters, which group sequences that are instances of the same workflow model. Each workflow model can then be learned by analyzing its instances. The authors conducted a comprehensive experimental study to evaluate the effectiveness and efficiency of the proposed approach.
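To make the idea of a UBP and the co-clustering step more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes the event log has already been reduced to (user, sequence) pairs, estimates each UBP as the empirical distribution of a user's sequences, measures similarity as shared probability mass, and replaces the paper's co-clustering method with a simple alternating assignment heuristic. All names (`build_ubps`, `overlap`, `co_cluster`), the toy log, and the requirement that `k` not exceed the number of users are illustrative assumptions.

```python
from collections import Counter, defaultdict


def build_ubps(log):
    """Map each user to an empirical probability distribution over sequences."""
    counts = defaultdict(Counter)
    for user, seq in log:
        counts[user][tuple(seq)] += 1
    return {
        user: {seq: n / sum(c.values()) for seq, n in c.items()}
        for user, c in counts.items()
    }


def overlap(p, q):
    """Similarity of two distributions: shared probability mass, in [0, 1]."""
    return sum(min(w, q.get(seq, 0.0)) for seq, w in p.items())


def co_cluster(ubps, k, rounds=10):
    """Alternate between labeling sequences and users (illustrative heuristic).

    Assumes k is at most the number of users.
    """
    users = sorted(ubps)
    # Farthest-first seeding: each new seed is the user least similar to the seeds so far.
    seeds = [users[0]]
    while len(seeds) < k:
        rest = (u for u in users if u not in seeds)
        seeds.append(min(rest, key=lambda u: max(overlap(ubps[u], ubps[s]) for s in seeds)))
    user_label = {
        u: max(range(k), key=lambda c: overlap(ubps[u], ubps[seeds[c]])) for u in users
    }
    seq_label = {}
    for _ in range(rounds):
        # Sequence step: a sequence joins the cluster whose users give it the
        # highest average probability.
        mass = defaultdict(lambda: [0.0] * k)
        size = [0] * k
        for u in users:
            size[user_label[u]] += 1
            for seq, w in ubps[u].items():
                mass[seq][user_label[u]] += w
        new_seq = {
            seq: max(range(k), key=lambda c: m[c] / size[c] if size[c] else 0.0)
            for seq, m in mass.items()
        }
        # User step: a user joins the cluster holding most of their probability mass.
        new_user = {}
        for u in users:
            total = [0.0] * k
            for seq, w in ubps[u].items():
                total[new_seq[seq]] += w
            new_user[u] = max(range(k), key=lambda c: total[c])
        if new_seq == seq_label and new_user == user_label:
            break  # assignments are stable
        seq_label, user_label = new_seq, new_user
    return user_label, seq_label


# Toy log (hypothetical): two users follow an approval process, two an ordering process.
log = [
    ("alice", ("open", "review", "approve")),
    ("alice", ("open", "review", "reject")),
    ("bob", ("open", "review", "approve")),
    ("bob", ("open", "review", "approve")),
    ("carol", ("order", "pay", "ship")),
    ("dave", ("order", "pay", "ship")),
    ("dave", ("order", "cancel")),
]
ubps = build_ubps(log)
user_label, seq_label = co_cluster(ubps, k=2)
print(user_label)
print(seq_label)
```

On this toy log the users of the approval process and the users of the ordering process end up in different user clusters, and each sequence cluster collects the instances from which one workflow model could then be learned.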
