Mining Top-k Distinguishing Temporal Sequential Patterns from Event Sequences

Sequential patterns are useful in many areas, such as biomedical sequence analysis, web browsing log analysis, and historical banking transaction log analysis. Distinguishing sequential patterns characterize the differences between two or more sets (classes) of sequences; they can be used to understand those classes and to identify informative features for classification. However, previous studies have not considered how to mine distinguishing sequential patterns from event sequences, where each event in a sequence has an associated timestamp. To fill this gap, this paper considers the mining of distinguishing temporal event patterns (DTEPs) from event sequences. After discussing the challenges of DTEP mining, we present DTEP-Miner, a mining method that uses several pruning techniques to find the DTEPs with the top-k contrast scores. Our empirical study on both real and synthetic data demonstrates that DTEP-Miner is effective and efficient.
