An Efficient Algorithm for Mining Maximal Frequent Item Sets

Problem Statement: Mining frequent patterns is a fundamental problem in data mining applications, and the algorithms that generate these patterns must be efficient. The objective was to propose an algorithm that generates frequent patterns in less time. Approach: We proposed a hashing-based algorithm that combines a vertical tidset representation of the database with effective pruning mechanisms. It removes all non-maximal frequent itemsets to obtain the exact set of maximal frequent itemsets (MFI) directly. It worked efficiently when the number of itemsets and tidsets was large. Results: The performance of our algorithm was compared with that of the recently developed MAFIA algorithm, and the results show that our algorithm performs better. Conclusions: The proposed algorithm performs effectively and generates frequent patterns faster.
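To make the ideas named in the abstract concrete, the following is a minimal sketch of maximal frequent itemset mining over a vertical tidset representation. It is not the authors' algorithm: the abstract does not detail their hashing scheme or pruning rules, so the sketch substitutes a plain depth-first search with a simple superset check. The names `build_tidsets`, `mine_maximal`, and `min_support`, and the sample transactions, are illustrative assumptions.

```python
from collections import defaultdict


def build_tidsets(transactions):
    """Vertical layout: map each item to the set of transaction ids containing it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_maximal(transactions, min_support):
    """Return the maximal frequent itemsets as a list of frozensets.

    Illustrative depth-first search; min_support is an absolute count.
    """
    tidsets = build_tidsets(transactions)
    # Keep only items that are frequent on their own, in a fixed order.
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_support)
    maximal = []

    def is_subsumed(candidate):
        # True if a known maximal itemset already contains the candidate.
        return any(candidate <= m for m in maximal)

    def extend(prefix, prefix_tids, tail):
        # Pruning: if prefix plus every remaining item is already covered,
        # nothing new can be found in this subtree.
        if is_subsumed(frozenset(prefix) | frozenset(tail)):
            return
        extended = False
        for idx, item in enumerate(tail):
            # Support of prefix + item is the size of the tidset intersection.
            new_tids = prefix_tids & tidsets[item]
            if len(new_tids) >= min_support:
                extended = True
                extend(prefix + [item], new_tids, tail[idx + 1:])
        # A prefix with no frequent extension is maximal unless already covered.
        if not extended and prefix and not is_subsumed(frozenset(prefix)):
            maximal.append(frozenset(prefix))

    extend([], set(range(len(transactions))), items)
    return maximal


# Hypothetical sample data: five transactions, minimum support of two.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"c", "d"},
    {"c", "d"},
]
result = mine_maximal(transactions, min_support=2)
print(sorted(sorted(m) for m in result))  # → [['a', 'b', 'c'], ['c', 'd']]
```

Because items are explored in a fixed order, any frequent superset of an itemset is reached before the itemset itself is tested for maximality, so the superset check is enough to discard non-maximal itemsets without a separate post-processing pass.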
