Mining Top-Rank-K Frequent Patterns

There have been many studies on the efficient discovery of frequent patterns in large databases. The usual framework uses a minimum support threshold to obtain all frequent patterns. However, it is nontrivial for users to choose a suitable threshold. In this paper, we propose a new mining task called mining top-rank-k frequent patterns, where k is the largest rank value of the frequent patterns to be mined. After analyzing the properties of top-rank-k frequent patterns in depth, we propose an efficient algorithm called FAE, short for "Filtering and Extending", to mine them. During mining, FAE filters out undesired patterns and selects useful patterns to generate longer potential frequent patterns. This strategy greatly reduces the search space. We also present results of applying the algorithm to a synthetic data set, which show its effectiveness.
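To make the task concrete, the following is a minimal Python sketch of a level-wise filter-and-extend loop. It is not the authors' FAE implementation, whose details the abstract does not give. It assumes that the rank of a pattern is the position of its support among the distinct support values sorted in descending order, and that only patterns occurring at least once are considered. The names `support`, `top_rank_k`, `offer`, and `threshold` are hypothetical helpers introduced for illustration.

```python
from itertools import combinations


def support(pattern, db):
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in db if pattern <= t)


def top_rank_k(transactions, k):
    """Return {support: set of patterns} for the k highest distinct supports.

    Assumption: rank = position of a pattern's support among the distinct
    support values, sorted in descending order.
    """
    db = [frozenset(t) for t in transactions]
    table = {}  # support value -> patterns having that support

    def threshold():
        # Once k distinct supports are known, anything below the
        # smallest of them cannot be a top-rank-k pattern.
        return min(table) if len(table) >= k else 1

    def offer(pattern, sup):
        if sup >= threshold():
            table.setdefault(sup, set()).add(pattern)
            if len(table) > k:
                del table[min(table)]  # drop the rank that fell out

    # Start from all single items, then grow one item per pass.
    level = {frozenset([item]) for t in db for item in t}
    while level:
        counted = {p: support(p, db) for p in level}
        for p, s in counted.items():
            offer(p, s)
        # Filtering: keep only patterns that can still be extended usefully.
        cut = threshold()
        kept = [p for p, s in counted.items() if s >= cut]
        # Extending: join kept patterns that differ in exactly one item.
        size = len(next(iter(level))) + 1
        level = {a | b for a, b in combinations(kept, 2) if len(a | b) == size}
    return table


if __name__ == "__main__":
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"], ["b", "c"]]
    for sup, patterns in sorted(top_rank_k(demo, 2).items(), reverse=True):
        print(sup, sorted("".join(sorted(p)) for p in patterns))
```

The filtering step is safe because support never increases when a pattern grows: every subset of a top-rank-k pattern has support at least as high, so it survives the cut and is available for the join. The user supplies only k, and the support cutoff rises on its own as the table fills.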
