A knowledge-intensive approach to process similarity calculation

Abstract. Process model comparison and retrieval of similar processes are key issues in many real-world situations, and are particularly relevant in applications (e.g., in medicine) where similarity quantification can be exploited for quality assessment. Most process comparison techniques described in the literature suffer from two main limitations: (1) they adopt a purely syntactic (vs. semantic) approach to process activity comparison, and/or (2) they ignore complex control flow information (i.e., anything other than sequence). These limitations oversimplify the problem and make the results of similarity-based process retrieval less reliable, especially when domain knowledge is available and could be used to quantify differences between activities or control flow constructs. In this paper, we aim to overcome both limitations by introducing a framework that extracts the actual process model from the available process execution traces through process mining techniques, and then compares the (mined) process models by relying on a novel distance measure. This distance measure, which represents the main contribution of this paper, addresses issues (1) and (2) above: (1) it provides a semantic, knowledge-intensive approach to process activity comparison by making use of domain knowledge; (2) it explicitly takes into account complex control flow constructs (such as AND and XOR splits/joins), thus fully accounting for the different semantics of control flow connections. The framework has been tested in stroke management, where our approach outperformed a state-of-the-art metric from the literature on a real-world event log, providing results that were closer to those of a human expert. Experiments in other domains are planned as future work.
