Mining Expressive Process Models by Clustering Workflow Traces

We propose a general framework for the process mining problem that subsumes the assumption of workflow schemas with local constraints only, so that it applies to more expressive specification languages independently of the particular syntax adopted. Specifically, we provide an effective process mining technique based on the largely unexplored concept of clustering workflow executions: clusters of executions that share the same structure and the same unexpected behavior (with respect to the local properties) are taken as evidence of the existence of global constraints.
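
The clustering idea can be illustrated with a minimal sketch. It is not the algorithm of the paper, only an assumption-laden toy: traces are lists of activity names, the "unexpected" patterns are supplied by hand rather than mined, and traces are grouped by the exact set of patterns they contain. The names `contains`, `cluster_by_signature`, `traces`, and `patterns` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def contains(trace, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `trace`."""
    n, m = len(trace), len(pattern)
    return any(tuple(trace[i:i + m]) == tuple(pattern) for i in range(n - m + 1))


def cluster_by_signature(traces, patterns):
    """Group traces by which of the given patterns they contain.

    Assumption: the patterns are given up front. A real miner would have
    to discover them from the log; this sketch skips that step entirely.
    """
    clusters = defaultdict(list)
    for trace in traces:
        # One boolean per pattern: the trace's feature vector.
        signature = tuple(contains(trace, p) for p in patterns)
        clusters[signature].append(trace)
    return dict(clusters)


# Toy log: after "a", the branch taken ("b" or "c") fixes the final activity.
traces = [
    ["a", "b", "d"],
    ["a", "c", "e"],
    ["a", "b", "d"],
    ["a", "c", "e"],
]
patterns = [("b", "d"), ("c", "e")]
clusters = cluster_by_signature(traces, patterns)
```

In this toy log every trace falls into one of two groups, and no trace mixes `b` with `e` or `c` with `d`. A regularity of that kind, visible only across whole executions, is the sort of evidence the abstract describes as pointing to a global constraint.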
