Using a knowledge cache for interactive discovery of association rules

Association rule mining is a valuable decision support technique that can be used to analyze customer preferences, buying patterns, and product correlations. Current systems are, however, handicapped by the long processing times that mining algorithms require, which make them unsuitable for interactive use. In this paper, we propose the use of a knowledge cache that can reduce the response time by several orders of magnitude. Most of the performance gain comes from the idea of guaranteed support, which allows us to completely eliminate database accesses in a large number of cases. Using this cache, the time taken to answer a query is proportional to just the size of the result, rather than to the size of the database. Cache replacement is best done by a benefit-metric-based strategy that can easily adapt to changing query patterns. We show that our caching scheme is quite robust, providing good performance on a wide variety of data distributions even for small cache sizes. We also compare algorithms that use precomputation to those that use caching and show that the best performance is obtained by combining the two techniques. Finally, we illustrate how the idea of caching can be readily extended to a broader class of problems such as the mining of generalized association rules.
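
As an illustration only, the following Python sketch shows one way a guaranteed support could let a query skip the database. It assumes that the cache records a threshold at or above which it is known to hold every frequent itemset, so any query whose minimum support meets that threshold is answered from the cache alone. The names `KnowledgeCache`, `mine_frequent_itemsets`, and `database_passes` are hypothetical and not taken from the paper, the brute-force miner is only a stand-in for a real mining algorithm, and cache replacement is not modeled.

```python
from itertools import combinations


def mine_frequent_itemsets(transactions, minsup):
    """Brute-force stand-in for a real mining algorithm (assumption)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, k):
            cset = frozenset(candidate)
            support = sum(cset <= t for t in transactions) / n
            if support >= minsup:
                result[cset] = support
                found = True
        if not found:
            # No frequent k-itemset means no frequent (k+1)-itemset either.
            break
    return result


class KnowledgeCache:
    """Hypothetical cache of itemset supports with a guaranteed support."""

    def __init__(self, transactions):
        self.transactions = transactions
        self.supports = {}
        # Nothing is guaranteed until the database has been mined once.
        self.guaranteed_support = float("inf")
        self.database_passes = 0  # counts how often the database is mined

    def frequent_itemsets(self, minsup):
        if minsup < self.guaranteed_support:
            # Cache miss: the cache may lack itemsets in this support range.
            self.supports = mine_frequent_itemsets(self.transactions, minsup)
            self.guaranteed_support = minsup
            self.database_passes += 1
        # Cache hit path: filter cached itemsets, no database access.
        # (A support-ordered index would avoid scanning the whole cache.)
        return {s: v for s, v in self.supports.items() if v >= minsup}


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
cache = KnowledgeCache(transactions)
first = cache.frequent_itemsets(0.5)    # mines the database once
second = cache.frequent_itemsets(0.75)  # answered from the cache
```

In this sketch the second query has a higher minimum support than the guaranteed support established by the first, so it touches only cached entries. A query with a lower minimum support would trigger another mining pass and lower the guaranteed support.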
