AIM: Another Itemset Miner

We present a new algorithm for mining frequent itemsets. Past studies have proposed various algorithms and techniques for improving the efficiency of the mining task. We integrate a combination of these techniques into an algorithm which utilize those techniques dynamically according to the input dataset. The algorithm main features include depth first search with vertical compressed database, diffset, parent equivalence pruning, dynamic reordering and projection. Experimental testing suggests that our algorithm and implementation significantly outperform existing algorithms/implementations.

[1]  Zvi M. Kedem,et al.  Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set , 1998, EDBT.

[2]  Rajeev Motwani,et al.  Dynamic itemset counting and implication rules for market basket data , 1997, SIGMOD '97.

[3]  Mohammed J. Zaki,et al.  Fast vertical mining using diffsets , 2003, KDD '03.

[4]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.

[5]  Roberto J. Bayardo,et al.  Efficiently mining long patterns from databases , 1998, SIGMOD '98.

[6]  Arbee L. P. Chen,et al.  An efficient approach to discovering knowledge from large databases , 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[7]  Hannu Toivonen,et al.  Sampling Large Databases for Association Rules , 1996, VLDB.

[8]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD '00.

[9]  Mohammed J. Zaki Scalable Algorithms for Association Mining , 2000, IEEE Trans. Knowl. Data Eng..

[10]  Devavrat Shah,et al.  Turbo-charging vertical mining of large databases , 2000, SIGMOD '00.

[11]  Ron Rymon,et al.  Search through Systematic Set Enumeration , 1992, KR.

[12]  George Karypis,et al.  SLPMiner: an algorithm for finding frequent sequential patterns using length-decreasing support constraint , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[13]  Johannes Gehrke,et al.  MAFIA: a maximal frequent itemset algorithm for transactional databases , 2001, Proceedings 17th International Conference on Data Engineering.

[14]  Tomasz Imielinski,et al.  Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.