The Frequent Pattern List: Another Framework for Mining Frequent Patterns

The mining of frequent patterns (or frequent itemsets) plays an essential role in many data mining tasks. One major methodology for mining frequent patterns is the Apriori-based approach, which is computationally costly because many candidate itemsets have to be generated and verified. More recently, another approach using the Frequent-Pattern Tree (FP-tree) has been suggested to avoid the generation of candidate itemsets, but at the cost of working with more complex data structures. In this paper, we propose a simpler and more efficient data structure for representing the database: the Frequent Pattern List (FPL). The FPL partitions both the search space and the solution space so that a divide-and-conquer approach can be applied in mining frequent patterns. With simple operations performed on the FPL, frequent patterns can be easily discovered. For a comparative study, we also elaborate on the essential differences between the FPL and the FP-tree in memory requirements, number of recursive calls, and run time. Experimental results show that our method performs satisfactorily in all these respects. At the end of this paper, we also explore possible extensions of the Frequent Pattern List for mining dense and large databases.
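
To make concrete the candidate generation and verification cost that the abstract attributes to Apriori-based methods, the following is a minimal sketch of a level-wise Apriori miner. It is not the FPL method proposed in the paper, whose operations are not described in this abstract. The function name `apriori`, the parameter `min_support` (taken here as an absolute transaction count), and the toy database are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Illustrative baseline only; `min_support` is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item in one pass over the database.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join pairs of frequent (k-1)-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Verification: another full database pass to count candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


# Toy database (hypothetical data, for illustration only).
demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
patterns = apriori(demo, min_support=3)
```

In this toy run the candidate {a, b, c} is generated and counted but then discarded because it occurs in only two transactions. Repeating that generate-then-verify step at every level is the overhead that candidate-free structures such as the FP-tree and the FPL aim to avoid.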
