Paper Information - An Effective Algorithm for Mining Quantitative Association Rules Based on High Dimension Cluster

Mining association rules plays an essential role in data mining. Many algorithms have been proposed for mining Boolean association rules, but they cannot deal with quantitative and categorical data directly. Quantitative attributes can be transformed into intervals so that Boolean algorithms can be applied to those intervals, but this approach is not effective and is difficult to scale up for high-dimensional cases. An efficient algorithm, DBSMiner (density based sub-space miner), is proposed. It uses the notion of "density-connected" to cluster the high-density sub-spaces of quantitative attributes, and it uses gravitation between grids and clusters to deal with the low-density cells that previous algorithms may miss. DBSMiner not only solves the problems of previous approaches but also scales up well for high-dimensional cases. Evaluations of DBSMiner have been performed using the car and shuttle databases maintained at the UCI machine learning repository. The results indicate that DBSMiner is effective and scales up approximately linearly with an increasing number of attributes.
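The two ideas named in the abstract, clustering density-connected grid cells and using a gravitation-like pull to place low-density cells, can be illustrated with a small sketch. This is not the authors' DBSMiner: the abstract gives no definitions, so the cell width, the density threshold `min_count`, the adjacency rule, and the pull formula (cluster mass times cell count, divided by squared distance to the cluster centroid) are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from collections import Counter, deque
from itertools import product


def grid_cells(points, width):
    """Map each point to an integer grid cell and count the points per cell."""
    return Counter(tuple(int(x // width) for x in p) for p in points)


def neighbors(cell):
    """Yield every cell adjacent to `cell` (sharing a face, edge, or corner)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        if any(offset):
            yield tuple(c + o for c, o in zip(cell, offset))


def dense_clusters(counts, min_count):
    """Group dense cells that are transitively adjacent (assumed 'density-connected')."""
    dense = {c for c, n in counts.items() if n >= min_count}
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        cluster, queue = [], deque([start])
        while queue:  # breadth-first walk over adjacent dense cells
            cell = queue.popleft()
            cluster.append(cell)
            for nb in neighbors(cell):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters


def attach_sparse(counts, clusters, min_count):
    """Assign each low-density cell to the cluster with the largest assumed pull."""
    assignment = {}
    for cell, n in counts.items():
        if n >= min_count or not clusters:
            continue
        pulls = []
        for cluster in clusters:
            mass = sum(counts[c] for c in cluster)
            centroid = [sum(c[d] for c in cluster) / len(cluster)
                        for d in range(len(cell))]
            dist2 = sum((a - b) ** 2 for a, b in zip(cell, centroid))
            # Assumed formula: mass * n / distance^2, guarded against zero distance.
            pulls.append(mass * n / max(dist2, 1e-9))
        assignment[cell] = pulls.index(max(pulls))
    return assignment


# Made-up 2-D data: two adjacent dense cells, one isolated dense cell, one sparse cell.
points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2),
          (1.1, 0.2), (1.3, 0.4), (1.2, 0.1),
          (5.1, 5.2), (5.3, 5.4), (5.2, 5.1),
          (2.5, 0.5)]
counts = grid_cells(points, width=1.0)
clusters = dense_clusters(counts, min_count=3)
print(clusters)
print(attach_sparse(counts, clusters, min_count=3))
```

In this toy data the lone point at (2.5, 0.5) falls in a cell below the threshold, so a purely density-based pass would discard it; the pull step instead attaches it to the nearby two-cell cluster. Each resulting cluster spans an interval per attribute, which is the kind of region from which quantitative rules could then be read off.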

Yulei Huang | Yunkai Guo | Junrui Yang
