Paper Information - A Cluster Algorithm Identifying the Clustering Structure

A Cluster Algorithm Identifying the Clustering Structure

Cluster analysis is a primary method for database mining. Most clustering algorithms require input parameters that are hard to determine but have a significant influence on the clustering result. Furthermore, for many real datasets there is no global parameter setting for which the result of the clustering algorithm accurately describes the intrinsic clustering structure. We introduce a new algorithm, CluICS, which produces a clustering explicitly. The algorithm first approximates the density of every point using a grid, then applies the k-means algorithm to the point densities to find the boundaries of the cluster structure, and finally uses the boundary values as the parameters of the next step, which produces the final clustering result. Both theoretical analysis and experimental results confirm that CluICS can cluster data of varying density by automatically setting different parameters in different partitions, and that it is much more efficient than the DBSCAN algorithm.
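The first two stages described in the abstract (grid-based density estimation, then k-means over the density values to find boundaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `cell_size` parameter, the choice of `k`, and the use of midpoints between k-means centers as boundaries are all assumptions made for illustration. The final stage, which uses the boundary values as parameters for the actual clustering, is not specified in the abstract and is left out.

```python
from collections import Counter
from math import floor


def grid_density(points, cell_size):
    """Approximate each point's density by the number of points in its grid cell."""
    cells = [tuple(floor(c / cell_size) for c in p) for p in points]
    counts = Counter(cells)
    return [counts[cell] for cell in cells]


def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's k-means on scalar values; returns the sorted centers."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centers evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def density_boundaries(densities, k=2):
    """Assumed rule: midpoints between adjacent k-means centers act as thresholds."""
    centers = kmeans_1d(densities, k)
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def density_level(density, boundaries):
    """Index of the density band a value falls into (0 = sparsest)."""
    return sum(density > b for b in boundaries)


# Toy data: a tight group of 6 points and a loose group of 3 points.
points = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (0.4, 0.4), (0.5, 0.1), (0.6, 0.6),
          (5.1, 5.2), (6.3, 6.1), (7.2, 5.4)]
densities = grid_density(points, cell_size=1.0)
boundaries = density_boundaries(densities, k=2)
levels = [density_level(d, boundaries) for d in densities]
```

In this toy run the six tightly packed points share one grid cell and land in the denser band, while the three scattered points fall in the sparser band. Each band could then be handed to a density-based clustering step with its own threshold, which is one possible reading of "different parameters in different partitions".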

Zhi-Wei Sun

[1] Hans-Peter Kriegel, et al. OPTICS: ordering points to identify the clustering structure, 1999, SIGMOD '99.

[2] Zhiwei Sun, et al. A fast clustering algorithm based on grid and density, 2005, Canadian Conference on Electrical and Computer Engineering, 2005.

[3] Jiong Yang, et al. STING: A Statistical Information Grid Approach to Spatial Data Mining, 1997, VLDB.

[4] Cao Jing, et al. Approaches for scaling DBSCAN algorithm to large spatial databases, 2000.

[5] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[6] Howard J. Hamilton, et al. DBRS: A Density-Based Spatial Clustering Method with Random Sampling, 2003, PAKDD.

[7] Jiong Yang, et al. An Approach to Active Spatial Data Mining Based on Statistical Information, 2000, IEEE Trans. Knowl. Data Eng.