Mining Top-K Association Rules

Mining association rules is a fundamental data mining task. However, depending on the choice of parameters (the minimum confidence and minimum support), current algorithms can become very slow and generate an extremely large number of results, or generate too few results and omit valuable information. This is a serious problem because, in practice, users have limited resources for analyzing the results and are therefore often interested in discovering only a certain number of rules, and fine-tuning the parameters is time-consuming. To address this problem, we propose an algorithm to mine the top-k association rules, where k is the number of association rules to be found and is set by the user. The algorithm uses a new approach for generating association rules, named rule expansions, and includes several optimizations. Experimental results show that the algorithm has excellent performance and scalability, and that it is an advantageous alternative to classical association rule mining algorithms when the user wants to control the number of rules generated.
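To make the problem statement concrete, the following is a minimal brute-force sketch of what "top-k association rules" can mean: among all rules whose confidence is at least minconf, keep the k with the highest support. It is not the rule-expansion algorithm proposed here, whose details the abstract does not give. The function names (`support`, `top_k_rules`), the `max_size` cap on rule length, the use of absolute support counts, and the tie-breaking by confidence are all assumptions made for illustration. The one idea it borrows from the top-k setting is that the support of the current k-th best rule acts as a rising internal threshold, so the user never has to choose a minimum support.

```python
import heapq
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def top_k_rules(transactions, k, minconf, max_size=3):
    """Return up to k rules (antecedent, consequent, support, confidence).

    Illustrative brute force only: enumerates every itemset of up to
    `max_size` items (an assumed cap, not part of the abstract).
    """
    if k < 1:
        return []
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    heap = []  # min-heap of (support, confidence, antecedent, consequent)

    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            sup = support(frozenset(combo), transactions)
            # Once k rules are held, the weakest one sets a support floor.
            if sup == 0 or (len(heap) == k and sup < heap[0][0]):
                continue
            # Try every split of the itemset into antecedent -> consequent.
            for r in range(1, size):
                for ante in combinations(combo, r):
                    cons = tuple(i for i in combo if i not in ante)
                    conf = sup / support(frozenset(ante), transactions)
                    if conf < minconf:
                        continue
                    entry = (sup, conf, ante, cons)
                    if len(heap) < k:
                        heapq.heappush(heap, entry)
                    elif entry > heap[0]:
                        heapq.heapreplace(heap, entry)

    ranked = sorted(heap, reverse=True)  # highest support first
    return [(a, c, s, conf) for s, conf, a, c in ranked]


# Toy database (hypothetical data, for illustration only).
db = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}, {"b"}]
for rule in top_k_rules(db, k=2, minconf=0.5):
    print(rule)
```

On this toy database the two rules returned are a → b and b → a, both with support 3. The sketch recomputes supports by rescanning the database and enumerates all itemsets, so it only serves to pin down the task; it says nothing about the performance of the proposed algorithm.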
