Toward more parallel frequent itemset mining algorithms

This paper presents improvements to the Parallel-FIMI method for static load balancing of the mining of all frequent itemsets on a distributed-memory (DM) parallel machine. The method probabilistically partitions the space of all frequent itemsets into partitions of approximately the same size. The improvements consist in the parallelization of the approximate partitioning of the search space and of the dynamic reordering of items during the construction of prefix-based equivalence classes. The new versions of the method achieve nearly linear speedups on up to 10 processors.
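
As an illustration of the general idea only, the following minimal Python sketch shows how a sample of frequent itemsets might be used to estimate the relative sizes of prefix-based equivalence classes and to assign those classes to processors so that the estimated loads are roughly equal. It is not the Parallel-FIMI implementation: the function names `estimate_class_sizes` and `assign_classes` are hypothetical, the sample is assumed to be given, the item order is fixed rather than dynamically reordered, classes are identified by a single-item prefix, and the greedy largest-first assignment is a stand-in chosen for brevity.

```python
from collections import Counter
import heapq


def estimate_class_sizes(sample, order):
    """Count sampled itemsets per prefix-based equivalence class.

    Assumption: a class is identified by the first item of an itemset
    under the fixed item order `order` (a single-item prefix).
    """
    rank = {item: i for i, item in enumerate(order)}
    counts = Counter()
    for itemset in sample:
        if itemset:
            prefix = min(itemset, key=rank.__getitem__)
            counts[prefix] += 1
    return counts


def assign_classes(counts, num_procs):
    """Greedy largest-first assignment of classes to processors.

    Stand-in heuristic (not taken from the paper): each class goes to
    the processor with the smallest estimated load so far.
    """
    heap = [(0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    # Largest classes first; ties broken by prefix for determinism.
    for prefix, size in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, p = heapq.heappop(heap)
        assignment[p].append(prefix)
        heapq.heappush(heap, (load + size, p))
    loads = {p: sum(counts[c] for c in cls) for p, cls in assignment.items()}
    return assignment, loads


# Hypothetical sample of frequent itemsets over items a, b, c.
sample = [{"a"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"},
          {"b"}, {"b", "c"}, {"c"}]
counts = estimate_class_sizes(sample, order=["a", "b", "c"])
assignment, loads = assign_classes(counts, num_procs=2)
print(assignment, loads)
```

In this toy sample, the class with prefix `a` is as large as the other two classes together plus one, so it is placed alone on one processor while the classes with prefixes `b` and `c` share the other.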