A Method to Optimize Apriori Algorithm for Frequent Items Mining

This paper studies the fundamental problems of mining association rules. Building on a summary of classical mining algorithms and the inherent defects of the Apriori algorithm, it investigates several related improvements. First, the database mapping method is changed so that the database does not have to be scanned multiple times. Second, once the support of the candidate item sets has been obtained, each candidate is checked against the prior knowledge of the Apriori algorithm to determine whether it is a frequent item set. If the candidate item sets generated from an element of the existing frequent item sets are certain not to be frequent, that element does not need to be joined with the others, which yields an optimized connecting (join) step. Lastly, an intersection operation is introduced to address the high time cost that Apriori incurs when matching candidate item sets against transaction patterns. Combining these improvement strategies, the optimized algorithm is presented and its advantages are explained in theory. To verify its effectiveness, the optimized algorithm is applied to floating car data. The experimental results show a shorter execution time and higher efficiency under different support and confidence levels.
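The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes a vertical mapping (each item mapped to the set of transaction ids that contain it) built in a single database scan, and it computes candidate support by intersecting those id sets instead of rescanning transactions. The names `build_tidsets`, `frequent_itemsets`, and `min_count` are hypothetical, and the join-step optimization is represented here only by the standard Apriori subset check.

```python
from itertools import combinations


def build_tidsets(transactions):
    """Scan the database once and map each item to the ids of the
    transactions that contain it (an assumed 'database mapping')."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for all itemsets whose support
    count is at least min_count (hypothetical absolute threshold)."""
    tidsets = build_tidsets(transactions)

    # Frequent 1-itemsets, each stored with its transaction-id set.
    level = {
        frozenset([item]): tids
        for item, tids in tidsets.items()
        if len(tids) >= min_count
    }
    result = {itemset: len(tids) for itemset, tids in level.items()}

    k = 2
    while level:
        next_level = {}
        for a, b in combinations(list(level), 2):
            candidate = a | b
            # Keep only joins that produce a k-itemset not yet accepted.
            if len(candidate) != k or candidate in next_level:
                continue
            # Apriori prior knowledge: every (k-1)-subset must be frequent.
            if any(frozenset(s) not in level
                   for s in combinations(candidate, k - 1)):
                continue
            # Support via intersection, with no further database scan.
            tids = level[a] & level[b]
            if len(tids) >= min_count:
                next_level[candidate] = tids
        result.update({c: len(t) for c, t in next_level.items()})
        level = next_level
        k += 1
    return result
```

For example, calling `frequent_itemsets([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}], 3)` reads the five transactions once; every later support count comes from set intersections on the stored id sets.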
