PC_Tree: Prime-Based and Compressed Tree for Maximal Frequent Patterns Mining

Knowledge discovery, the extraction of knowledge from large amounts of data, is a desirable task in competitive businesses. Data mining is an essential step in the knowledge discovery process. Frequent patterns play an important role in data mining tasks such as clustering, classification, prediction, and association analysis. However, mining all frequent patterns leads to a massive number of patterns. A reasonable solution is to identify the maximal frequent patterns, which form the smallest representative set from which all frequent patterns can be generated. This research proposes a new method for mining maximal frequent patterns. The method includes an efficient database encoding technique, a novel tree structure called PC_Tree, and the PC_Miner algorithm. Experimental results verify its compactness and performance.
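To make the notion of a maximal frequent pattern concrete, the following is a minimal brute-force sketch in Python. It is not the PC_Miner algorithm and does not use the PC_Tree structure or the encoding technique proposed here. The function names, the toy transactions, and the support threshold are assumptions chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose support
    is at least min_support. Brute force, for illustration only."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent[cset] = support
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database of five transactions.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("cd"),
    frozenset("cd"),
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)
```

With this toy data and a minimum support of 2, nine itemsets are frequent but only two are maximal, {a, b, c} and {c, d}. Every other frequent itemset is a subset of one of these two, which is the sense in which the maximal set is a compact representative of all frequent patterns.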
