Similarity Measures to Compare Episodes in Modeled Traces

This paper presents a similarity measure for comparing episodes in modeled traces. A modeled trace is a structured record of observations captured from a user's interactions with a computer system. An episode is a sub-part of a modeled trace that describes a particular task performed by the user. Our method combines a similarity measure for comparing the elements of episodes with an implementation of the Smith-Waterman algorithm for comparing the episodes themselves. This algorithm respects the temporal ordering of elements and tolerates the noise generally found in the traces we deal with. Our evaluations show that the approach offers satisfactory comparison quality and response time. We illustrate its use in an application for recommending video sequences.
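To make the combination of an element-level similarity with Smith-Waterman local alignment concrete, here is a minimal sketch in Python. It is not the paper's implementation: the element representation (an `(event type, target)` pair), the `element_similarity` function, the mapping from similarity to substitution score, the gap penalty, and the normalization are all assumptions chosen for illustration.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical element representation: (event type, target).
Element = Tuple[str, str]


def element_similarity(a: Element, b: Element) -> float:
    """Illustrative element similarity in [0, 1] (an assumption, not the paper's measure)."""
    if a == b:
        return 1.0   # same event type and same target
    if a[0] == b[0]:
        return 0.5   # same event type, different target
    return 0.0       # unrelated elements


def smith_waterman(
    s: Sequence[Element],
    t: Sequence[Element],
    sim: Callable[[Element, Element], float] = element_similarity,
    gap: float = 1.0,
) -> float:
    """Return the best local alignment score between two episodes."""
    rows, cols = len(s) + 1, len(t) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            # Map similarity in [0, 1] to a substitution score in [-1, 1].
            sub = 2.0 * sim(s[i - 1], t[j - 1]) - 1.0
            h[i][j] = max(
                0.0,                    # start a new local alignment
                h[i - 1][j - 1] + sub,  # align s[i-1] with t[j-1]
                h[i - 1][j] - gap,      # skip an element of s
                h[i][j - 1] - gap,      # skip an element of t
            )
            best = max(best, h[i][j])
    return best


def episode_similarity(s: Sequence[Element], t: Sequence[Element]) -> float:
    """Normalize the local score by the shorter episode's length, giving a value in [0, 1]."""
    if not s or not t:
        return 0.0
    return smith_waterman(s, t) / min(len(s), len(t))


query = [("play", "v1"), ("pause", "v1"), ("seek", "v1"), ("play", "v1")]
noisy = [("play", "v1"), ("pause", "v1"), ("click", "ad"), ("seek", "v1"), ("play", "v1")]
print(episode_similarity(query, noisy))
```

In this sketch, the single unrelated `("click", "ad")` element costs one gap penalty rather than breaking the alignment, which is the kind of noise tolerance the abstract attributes to the algorithm, while the order of the remaining elements still has to match.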
