Fast mining of association rules in large-scale problems

We study the scalability of Apriori-like algorithms for mining association rules. We show how Apriori's performance deteriorates on large-scale problems and propose alternative data structures and operations that allow the Apriori-trick optimization to be applied at that scale. In the proposed method, the database is transformed into a more efficient structure that is used, together with the intersection operation, to find the frequent itemsets in the database. The performance evaluation shows that, at the cost of a minor increase in storage, the proposed technique significantly outperforms existing algorithms, especially on large-scale problems.
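The abstract does not specify the transformed structure, so the following is only an illustrative sketch under an assumption: the database is stored as a mapping from each item to the set of transaction IDs that contain it (a vertical layout). Under that assumption, the support of a candidate itemset is the size of an intersection of such sets, and the Apriori trick discards any candidate that has an infrequent subset before the intersection is computed. The names `build_tidsets` and `frequent_itemsets` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def build_tidsets(transactions):
    """Transform the database into item -> set of transaction IDs (assumed layout)."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    tidsets = build_tidsets(transactions)
    # Level 1: keep only the frequent single items.
    current = {
        frozenset([item]): tids
        for item, tids in tidsets.items()
        if len(tids) >= min_support
    }
    result = {}
    k = 1
    while current:
        result.update({itemset: len(tids) for itemset, tids in current.items()})
        k += 1
        next_level = {}
        for a, b in combinations(list(current), 2):
            candidate = a | b
            if len(candidate) != k or candidate in next_level:
                continue
            # Apriori trick: every (k-1)-subset of a frequent k-itemset
            # must itself be frequent, so skip the candidate otherwise.
            if any(frozenset(sub) not in current
                   for sub in combinations(candidate, k - 1)):
                continue
            # Support counting by intersection instead of a database scan.
            tids = current[a] & current[b]
            if len(tids) >= min_support:
                next_level[candidate] = tids
        current = next_level
    return result
```

In this sketch the extra storage is the set of transaction IDs kept for each frequent itemset of the current level, which is the kind of trade-off the abstract describes: more memory in exchange for avoiding repeated passes over the database.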
