Timed sequential pattern mining based on confidence in accumulated intervals

Many applications of sequential patterns require a guarantee that a particular event will happen within a given period of time. We propose CAI-PrefixSpan, a new data mining algorithm that obtains confident timed sequential patterns from sequential databases. It is based on PrefixSpan and takes advantage of the pattern-growth approach. Given a particular event sequence, it first calculates the confidence level for the eventual occurrence of a particular event after that sequence. For those events that pass the minimum confidence requirement, it then computes the minimal time interval that satisfies the support requirement. Finally, it generates the corresponding projected databases and applies itself recursively to them. By using the timing information, it obtains fewer but more confident sequential patterns. We implemented CAI-PrefixSpan alongside PrefixSpan and compared the two in terms of the number of patterns obtained and execution efficiency. Our effectiveness and performance study shows that CAI-PrefixSpan is a valuable and efficient approach to obtaining timed sequential patterns.
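The steps described above (confidence check, minimal interval, projection, recursion) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact definitions, so the sketch assumes that confidence is the fraction of projected sequences in which the event eventually follows the prefix, and that the "minimal time interval" is the smallest gap within which at least `min_sup` sequences contain the event. The names `mine`, `_grow`, `min_conf`, and `min_sup`, the data layout, and the toy database are all hypothetical, and only the first occurrence of each event after the prefix is considered.

```python
from collections import defaultdict


def mine(db, min_conf, min_sup):
    """Return a list of (pattern, confidence) pairs.

    db is a list of sequences; each sequence is a time-ordered list of
    (timestamp, event) pairs. A pattern is a list of (event, interval)
    pairs, where interval is the assumed maximal gap to the previous event
    (None for the first event of the pattern).
    """
    patterns = []
    first = defaultdict(list)  # event -> [(sequence id, position)]
    for sid, seq in enumerate(db):
        seen = set()
        for pos, (_, event) in enumerate(seq):
            if event not in seen:  # keep only the first occurrence
                seen.add(event)
                first[event].append((sid, pos))
    # Seed the search with single events that meet plain support.
    for event, proj in sorted(first.items()):
        if len(proj) >= min_sup:
            _grow(db, [(event, None)], proj, min_conf, min_sup, patterns)
    return patterns


def _grow(db, prefix, proj, min_conf, min_sup, patterns):
    """Extend prefix by one event using the projected database proj.

    proj holds (sequence id, position of the last matched item).
    """
    gaps = defaultdict(list)  # event -> [(gap, sequence id, position)]
    for sid, pos in proj:
        t0 = db[sid][pos][0]
        seen = set()
        for nxt in range(pos + 1, len(db[sid])):
            t, event = db[sid][nxt]
            if event not in seen:
                seen.add(event)
                gaps[event].append((t - t0, sid, nxt))
    for event, occ in sorted(gaps.items()):
        # Step 1 (assumed definition): confidence of eventual occurrence.
        conf = len(occ) / len(proj)
        if conf < min_conf or len(occ) < min_sup:
            continue
        # Step 2 (assumed definition): smallest gap covering min_sup sequences.
        occ.sort()
        interval = occ[min_sup - 1][0]
        # Step 3: project onto sequences that meet the interval, then recurse.
        new_proj = [(sid, pos) for gap, sid, pos in occ if gap <= interval]
        new_prefix = prefix + [(event, interval)]
        patterns.append((new_prefix, conf))
        _grow(db, new_prefix, new_proj, min_conf, min_sup, patterns)


# Toy database (invented for illustration only).
db = [
    [(0, "a"), (2, "b"), (5, "c")],
    [(0, "a"), (3, "b"), (4, "c")],
    [(0, "a"), (10, "b")],
    [(0, "b"), (1, "c")],
]
patterns = mine(db, min_conf=0.6, min_sup=2)
for pattern, conf in patterns:
    print(pattern, round(conf, 2))
```

In this sketch each pattern carries a time bound per step, so a sequence in which the event occurs too late (such as the third toy sequence, where `b` follows `a` only after 10 time units) counts toward confidence but is dropped from the projected database used for further growth.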
