Paper information - Mining association rules: anti-skew algorithms

Mining association rules: anti-skew algorithms

Mining association rules among items in a large database has been recognized as one of the most important data mining problems. All previously proposed approaches to this problem require scanning the entire database at least twice, or almost twice, in the worst case. We propose several techniques that overcome the problem of data skew in basket data. These techniques reduce the maximum number of scans to fewer than two, and in most cases find all association rules in about one scan. Our algorithms employ prior knowledge, collected during the mining process and/or via sampling, to further reduce the number of candidate itemsets and to identify false candidate itemsets at an earlier stage.

Jun-Lin Lin | Margaret H. Dunham

[1] Shamkant B. Navathe, et al. An Efficient Algorithm for Mining Association Rules in Large Databases, 1995, VLDB.

[2] Srinivasan Parthasarathy, et al. Evaluation of sampling for data mining of association rules, 1997, Proceedings Seventh International Workshop on Research Issues in Data Engineering. High Performance Database Management for Large-Scale Applications.

[3] Tomasz Imielinski, et al. Mining association rules between sets of items in large databases, 1993, SIGMOD Conference.

[4] Heikki Mannila, et al. Efficient Algorithms for Discovering Association Rules, 1994, KDD Workshop.

[5] Philip S. Yu, et al. Data Mining: An Overview from a Database Perspective, 1996, IEEE Trans. Knowl. Data Eng.

[6] Hannu Toivonen, et al. Sampling Large Databases for Association Rules, 1996, VLDB.

[7] Andreas Mueller, et al. Fast sequential and parallel algorithms for association rule mining: a comparison, 1995.

[8] Ramakrishnan Srikant, et al. Fast Algorithms for Mining Association Rules in Large Databases, 1994, VLDB.