Mining Precise-Positioning Episode Rules from Event Sequences

Episode rule mining is a popular framework for discovering sequential rules from event sequence data. However, traditional episode rule mining methods indicate only that the consequent event is likely to occur within a given time interval after the antecedent events. Because they lack fine-grained response times, they cannot meet the requirements of many time-sensitive applications, such as program security trading and intelligent transportation management. In this study, we introduce the concept of a fixed-gap episode to address this problem. A fixed-gap episode consists of an ordered set of events in which the elapsed time between any two consecutive events is a constant. Based on this concept, we formulate the problem of mining precise-positioning episode rules, in which the occurrence time of each event in the consequent is explicitly specified. In addition, we develop a trie-based data structure for mining such rules, incorporating several pruning strategies to improve performance and reduce memory consumption. Experimental results on real datasets show the superiority of our proposed algorithms.
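To make the fixed-gap episode concept concrete, the following minimal sketch checks where such an episode occurs in a timestamped event sequence. It is an illustration only, not the trie-based algorithm described above. It assumes that each pair of consecutive events carries its own prescribed, constant gap, and the function name `find_fixed_gap_occurrences` and its parameters are hypothetical.

```python
from collections import defaultdict


def find_fixed_gap_occurrences(sequence, events, gaps):
    """Return the start times at which a fixed-gap episode occurs.

    sequence: iterable of (timestamp, event) pairs
    events:   ordered event labels of the episode, e.g. ["A", "B", "C"]
    gaps:     gaps[i] is the exact elapsed time between events[i]
              and events[i + 1] (assumed representation, see lead-in)
    """
    if len(gaps) != len(events) - 1:
        raise ValueError("need exactly one gap per consecutive event pair")

    # Index the sequence: timestamp -> set of events seen at that time.
    events_at = defaultdict(set)
    for timestamp, event in sequence:
        events_at[timestamp].add(event)

    starts = []
    for start in sorted(events_at):
        if events[0] not in events_at[start]:
            continue
        current = start
        matched = True
        # Each later event must appear at exactly the prescribed offset.
        for event, gap in zip(events[1:], gaps):
            current += gap
            if event not in events_at.get(current, ()):
                matched = False
                break
        if matched:
            starts.append(start)
    return starts
```

For example, with the sequence `[(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B"), (7, "C")]`, the episode A, then B one time unit later, then C two time units after B, occurs only at start time 1. A traditional window-based episode would instead accept any B and C falling inside the window, which is the loss of timing precision that the abstract points to.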
