Domain and data partitioning for parallel mining of frequent closed itemsets

In this paper, we propose an algorithm that partitions both the search space and the database for parallel mining of frequent closed itemsets in large databases. The search space is partitioned by splitting the power set lattice of the total item set into two sub-lattices. Conditional databases are used to partition the large database. Combining search space and database partitioning allows the parallel processors to mine the frequent closed itemsets independently, which minimizes interprocessor communication and synchronization. The partitioning also ensures load balance among the parallel processors.
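
The idea of splitting the lattice and using a conditional database can be illustrated with a small Python sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes the split is made on a single item x, so that one sub-lattice holds the itemsets containing x and the other holds those without x, and it uses brute-force enumeration in place of a real mining procedure. The function names (`mine_closed`, `partitioned_mine`) and the toy data in the usage note are hypothetical.

```python
from itertools import combinations


def support(itemset, db):
    """Number of transactions in db that contain itemset."""
    return sum(1 for t in db if itemset <= t)


def closure(itemset, db):
    """Intersection of all transactions in db that contain itemset."""
    covering = [t for t in db if itemset <= t]
    return frozenset.intersection(*covering) if covering else itemset


def mine_closed(db, items, min_sup, prefix=frozenset()):
    """Brute-force search for frequent closed itemsets of the form
    prefix | S with S a subset of items. Returns {itemset: support}."""
    found = {}
    ordered = sorted(items)
    for k in range(len(ordered) + 1):
        for combo in combinations(ordered, k):
            cand = prefix | frozenset(combo)
            if not cand:
                continue  # skip the empty itemset
            sup = support(cand, db)
            if sup >= min_sup and closure(cand, db) == cand:
                found[cand] = sup
    return found


def partitioned_mine(db, items, x, min_sup):
    """Split the lattice on item x (an assumed splitting rule) and
    mine the two sub-lattices as independent tasks."""
    rest = items - {x}
    # Conditional database of x: only the transactions containing x.
    # Every itemset containing x has all of its support in here.
    cond_db = [t for t in db if x in t]
    with_x = mine_closed(cond_db, rest, min_sup, prefix=frozenset({x}))
    # Simplification: the sub-lattice without x is checked against the
    # full database, because closedness depends on whether x occurs in
    # every supporting transaction.
    without_x = mine_closed(db, rest, min_sup)
    return {**with_x, **without_x}
```

For example, with the transactions `abc, abc, ab, bd, bcd` stored as frozensets and a minimum support of 2, `partitioned_mine(db, frozenset("abcd"), "a", 2)` returns the same result as mining the whole lattice at once with `mine_closed(db, frozenset("abcd"), 2)`. Because the two calls inside `partitioned_mine` share no intermediate state, they could in principle run on separate processors without communicating.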
