Efficient Algorithms for Discovering Association Rules

Association rules are statements of the form "for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it also has value 1 in column B". Agrawal, Imielinski, and Swami introduced the problem of mining association rules from large collections of data, and gave a method based on successive passes over the database. We give an improved algorithm for the problem. The method is based on careful combinatorial analysis of the information obtained in previous passes; this makes it possible to eliminate unnecessary candidate rules. Experiments on a university course enrollment database indicate that the method outperforms the previous one by a factor of 5. We also show that sampling is in general a very efficient way of finding such rules.
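To make the rule form concrete, the following minimal sketch computes how strongly a rule "W implies B" holds on a small 0/1 relation. It is an illustration only, not the algorithm of the paper: the function name `rule_stats`, the dictionary-per-row representation, and the toy data are all assumptions. The percentage is read here in the usual way, as the fraction of rows with 1 in every column of W that also have 1 in column B (commonly called confidence); the sketch also reports the fraction of all rows satisfying both sides (commonly called support), a term the abstract itself does not use.

```python
from typing import Dict, Iterable, List, Tuple


def rule_stats(rows: List[Dict[str, int]], W: Iterable[str], B: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule W => B over a 0/1 relation.

    rows: one dict per row, mapping column name to 0 or 1 (hypothetical layout).
    W:    columns on the left-hand side of the rule.
    B:    the single column on the right-hand side.
    """
    W = list(W)
    # Rows that have value 1 in every column of W.
    matches_w = [r for r in rows if all(r[c] == 1 for c in W)]
    if not matches_w:
        # No row satisfies the left-hand side, so the rule has no evidence.
        return 0.0, 0.0
    # Among those rows, count the ones that also have 1 in column B.
    both = sum(1 for r in matches_w if r[B] == 1)
    support = both / len(rows)
    confidence = both / len(matches_w)
    return support, confidence


# Toy relation (made-up data): three rows have 1 in both A and C,
# and two of those three also have 1 in B.
rows = [
    {"A": 1, "C": 1, "B": 1},
    {"A": 1, "C": 1, "B": 1},
    {"A": 1, "C": 1, "B": 0},
    {"A": 0, "C": 1, "B": 1},
]
support, confidence = rule_stats(rows, {"A", "C"}, "B")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

On this toy relation the rule {A, C} implies B holds for two of the three rows that match the left-hand side. A sampling-based approach of the kind the abstract mentions would evaluate the same quantities on a random subset of the rows rather than on the whole relation.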
