Looking into the seeds of time: Discovering temporal patterns in large transaction sets

This paper studies the problem of mining frequent itemsets along with their temporal patterns from large transaction sets. A model is proposed in which users define a large set of temporal patterns that are interesting or meaningful to them. A temporal pattern defines the set of time points where the user expects a discovered itemset to be frequent. The model is general in that (i) no constraints are placed on the interesting patterns given by the users, and (ii) two measures-inclusiveness and exclusiveness-are used to capture how well the temporal patterns match the time points given by the discovered itemsets. Intuitively, these measures indicate to what extent a discovered itemset is frequent at time points included in a temporal pattern p, but not at time points not in p. Using these two measures, one is able to model many temporal data mining problems appeared in the literature, as well as those that have not been studied. By exploiting the relationship within and between itemset space and pattern space simultaneously, a series of pruning techniques are developed to speed up the mining process. Experiments show that these pruning techniques allow one to obtain performance benefits up to 100 times over a direct extension of non-temporal data mining algorithms.

[1]  Heikki Mannila,et al.  Efficient Algorithms for Discovering Association Rules , 1994, KDD Workshop.

[2]  Rajeev Motwani,et al.  Dynamic itemset counting and implication rules for market basket data , 1997, SIGMOD '97.

[3]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[4]  Sridhar Ramaswamy,et al.  Cyclic association rules , 1998, Proceedings 14th International Conference on Data Engineering.

[5]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[6]  Srinivasan Parthasarathy,et al.  New Algorithms for Fast Discovery of Association Rules , 1997, KDD.

[7]  Chun Zhang,et al.  Storing and querying ordered XML using a relational database system , 2002, SIGMOD '02.

[8]  X.S. Wang,et al.  Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences , 1998, IEEE Trans. Knowl. Data Eng..

[9]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[10]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[11]  Jiong Yang,et al.  TAR: temporal association rules on evolving numerical attributes , 2001, Proceedings 17th International Conference on Data Engineering.

[12]  Sushil Jajodia,et al.  Discovering calendar-based temporal association rules , 2001, Proceedings Eighth International Symposium on Temporal Representation and Reasoning. TIME 2001.

[13]  Nandit Soparkar,et al.  Data organization and access for efficient data mining , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[14]  Sridhar Ramaswamy,et al.  On the Discovery of Interesting Patterns in Association Rules , 1998, VLDB.

[15]  Heikki Mannila,et al.  Discovering Frequent Episodes in Sequences , 1995, KDD.

[16]  Shamkant B. Navathe,et al.  An Efficient Algorithm for Mining Association Rules in Large Databases , 1995, VLDB.

[17]  Zvi M. Kedem,et al.  Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set , 1998, EDBT.

[18]  Sushil Jajodia,et al.  An Algebraic Representation of Calendars , 2004, Annals of Mathematics and Artificial Intelligence.

[19]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[20]  Jinyan Li,et al.  Efficient mining of emerging patterns: discovering trends and differences , 1999, KDD '99.