A graph distance based metric for data oriented workflow retrieval with variable time constraints

Many applications in business process management, such as workflow retrieval and process mining, require measuring the similarity between business processes. However, most existing approaches and models cannot represent variable constraints or support data-oriented workflow retrieval under different QoS requirements, and they do not let users express arbitrary constraints based on the graph structure of workflows. These limitations impede the customization and reuse of workflows, especially data-oriented workflows. In this paper, we address workflow retrieval with variable time constraints. We propose a graph-distance-based approach for measuring the similarity between data-oriented workflows with variable time constraints. First, a formal structure called the Time Dependency Graph (TDG) is proposed and used as the representation model of workflows, so that similarity comparison between two workflows reduces to computing the similarity between their TDGs. Second, we check whether the TDGs of the two workflows being compared are compatible. A distance-based measure is then proposed that computes their similarity from the normalization matrices built from their TDGs. We prove theoretically that the proposed measure satisfies all the properties of a distance. In addition, several exemplar processes are studied to illustrate the effectiveness of our approach to workflow similarity comparison.
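The pipeline outlined above (build a graph, check compatibility, normalize a matrix, take a distance) can be illustrated with a minimal sketch. The abstract does not give the paper's definitions, so everything concrete here is an assumption made for illustration: a TDG is modelled as a dictionary of directed edges weighted by a time bound, two graphs count as compatible when they share the same activity set, normalization divides by the Frobenius norm, and the distance is the Frobenius norm of the difference. The function names `tdg_matrix`, `compatible`, `normalize`, and `distance` are hypothetical and are not taken from the paper.

```python
import math


def tdg_matrix(nodes, edges):
    """Build a weight matrix from edges given as {(source, target): time_bound}.

    Assumption: a TDG is approximated by a weighted adjacency matrix.
    """
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for (source, target), bound in edges.items():
        matrix[index[source]][index[target]] = float(bound)
    return matrix


def compatible(nodes_a, nodes_b):
    """Assumed compatibility test: both graphs use the same activity set."""
    return set(nodes_a) == set(nodes_b)


def normalize(matrix):
    """Scale a matrix to unit Frobenius norm (an all-zero matrix is left as is)."""
    norm = math.sqrt(sum(x * x for row in matrix for x in row))
    if norm == 0.0:
        return [row[:] for row in matrix]
    return [[x / norm for x in row] for row in matrix]


def distance(nodes, edges_a, edges_b):
    """Frobenius distance between the two normalized weight matrices."""
    a = normalize(tdg_matrix(nodes, edges_a))
    b = normalize(tdg_matrix(nodes, edges_b))
    return math.sqrt(
        sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    )


if __name__ == "__main__":
    activities = ["a", "b", "c"]
    w1 = {("a", "b"): 2, ("b", "c"): 3}
    w2 = {("a", "b"): 1, ("a", "c"): 5}
    if compatible(activities, activities):
        print(distance(activities, w1, w2))
```

Under these assumptions the function is non-negative, symmetric, and obeys the triangle inequality, because it is a Euclidean distance between normalized matrices. Note that two graphs whose time bounds differ only by a common scale factor normalize to the same matrix and so have distance zero; whether the paper's measure treats such graphs as identical is not stated in the abstract.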
