Advanced Data Mining and Applications

The existing matrix-based Apriori algorithm still suffers from two problems: the candidate itemsets are too large, and the matrix takes up too much memory. To solve these problems, an improved Apriori algorithm based on a compressed matrix is proposed. The algorithm makes four improvements: (1) reducing the number of matrix scans during compression by adding two arrays that record the counts of 1s in each row and each column; (2) minimizing the size of the matrix and improving space utilization by deleting, during compression, both the infrequent itemsets and the itemsets that cannot be connected; (3) reducing errors in the mining result by changing the condition for deleting unnecessary transaction columns; (4) reducing the number of iterations by changing the stopping condition of the program. Instance analysis and experimental results show that the proposed algorithm accurately and efficiently mines all frequent itemsets in a transaction database and improves the efficiency of mining association rules.
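The abstract does not give the exact deletion and stopping conditions, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a Boolean matrix whose rows are itemsets and whose columns are transactions. The names `compress`, `mine`, `row_count`, and `col_count` are hypothetical, as are the column threshold (fewer than k+1 ones) and the stopping rule (fewer than k+1 remaining rows), which are conservative choices made for this example. The deletion of itemsets that cannot be connected is omitted.

```python
from itertools import combinations


def compress(rows, min_support, k):
    """Compress a k-itemset matrix (dict: itemset tuple -> 0/1 list per transaction).

    Returns (frequent, compressed): the support counts of the frequent
    k-itemsets, and the reduced matrix used to build (k+1)-itemsets.
    """
    # Row array: number of 1s in each row, i.e. the support count.
    row_count = {key: sum(vec) for key, vec in rows.items()}
    frequent = {key: c for key, c in row_count.items() if c >= min_support}
    kept = {key: rows[key] for key in frequent}

    # Column array: number of 1s in each transaction column.
    n_cols = len(next(iter(kept.values()))) if kept else 0
    col_count = [sum(vec[j] for vec in kept.values()) for j in range(n_cols)]

    # Assumed condition: a transaction holding a frequent (k+1)-itemset
    # contains at least k+1 frequent k-itemsets, so sparser columns are dropped.
    keep = [j for j in range(n_cols) if col_count[j] >= k + 1]
    compressed = {key: [vec[j] for j in keep] for key, vec in kept.items()}
    return frequent, compressed


def mine(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    items = sorted({i for t in transactions for i in t})
    rows = {(i,): [int(i in t) for t in transactions] for i in items}
    result = {}
    k = 1
    while rows:
        frequent, rows = compress(rows, min_support, k)
        result.update(frequent)
        # Assumed stopping rule: a (k+1)-itemset needs k+1 frequent k-subsets.
        if len(rows) < k + 1:
            break
        # Join k-itemsets sharing the first k-1 items; AND their rows.
        rows = {
            a + (b[-1],): [x & y for x, y in zip(rows[a], rows[b])]
            for a, b in combinations(sorted(rows), 2)
            if a[:-1] == b[:-1]
        }
        k += 1
    return result


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(mine(db, min_support=3).items()):
        print(itemset, support)
```

In this sketch the support counts are read from the row array before any column is removed, so dropping columns changes only the matrix passed to the next level and not the supports already recorded.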