Workflow Clustering Method Based on Process Similarity

Process-centric information systems have accumulated a large number of process models. Process designers continue to create new models, and they need tools that analyze processes from various viewpoints. This paper proposes a novel approach to process analysis: workflow clustering, which analyzes accumulated workflow process models and classifies them into characteristic groups. The framework consists of two phases, domain classification and pattern analysis. Domain classification uses an activity similarity measure, while pattern analysis uses a transition similarity measure. Process models are represented as weighted complete dependency graphs, and the similarities among their graph vectors are then estimated, taking into account the relative frequency of each activity and transition. Finally, the models are clustered on the basis of these similarities by a hierarchical clustering algorithm. We implemented the methodology and ran experiments on sets of synthetic processes. Workflow clustering is applicable to various kinds of process analysis, such as workflow recommendation, workflow mining, and process pattern analysis.
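
The domain classification phase described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several choices here are assumptions made for illustration only: activities are weighted with an inverse-frequency weight, similarity is measured by the cosine of the weighted activity vectors, and clustering is agglomerative with average linkage and a stopping threshold. The model names, activities, and the threshold value are hypothetical.

```python
import math
from collections import Counter

# Hypothetical process models, each reduced to its list of activities.
models = {
    "order_a": ["receive order", "check stock", "ship", "invoice"],
    "order_b": ["receive order", "check stock", "invoice"],
    "hire_a": ["post job", "interview", "offer"],
    "hire_b": ["post job", "interview", "reject"],
}
names = list(models)
n_models = len(names)

# Number of models in which each activity occurs.
doc_freq = Counter(a for acts in models.values() for a in set(acts))

def activity_vector(activities):
    # Assumed weighting: rarer activities get a larger weight.
    return {a: math.log(n_models / doc_freq[a]) + 1.0 for a in set(activities)}

def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = [activity_vector(models[name]) for name in names]
sim = [[cosine(u, v) for v in vectors] for u in vectors]

def hierarchical_clusters(sim, threshold):
    # Average-linkage agglomerative clustering: repeatedly merge the most
    # similar pair of clusters until no pair reaches the threshold.
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                links = [sim[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(links) / len(links)
                if avg > best:
                    best, pair = avg, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

clusters = hierarchical_clusters(sim, threshold=0.3)  # threshold is an assumption
for group in clusters:
    print([names[i] for i in group])
```

In this toy data the two order-handling models share most of their activities and the two hiring models share some, while the two domains share none, so the models separate into two groups. The pattern analysis phase could reuse the same functions with transitions (pairs of consecutive activities) in place of activities, again as an assumption rather than the paper's stated procedure.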
