GDCLU: A New Grid-Density Based Clustering Algorithm

This paper addresses density-based clustering in data mining, where clusters are formed from regions of high point density. The best-known algorithm in this area is DBSCAN [1], which relies on two parameters that determine the shape of the resulting clusters. Because these parameters are fixed for the whole dataset, a major weakness of DBSCAN is its inability to handle clusters in multi-density environments. In this paper, a new grid-based density clustering algorithm, GDCLU, is proposed, which uses a new definition of dense regions: it decides whether a grid cell is dense based on the densities of its neighbors. This definition enables GDCLU to handle clusters of different shapes in multi-density environments. The algorithm is also scale independent. Its time complexity is O(n), where n is the number of points in the dataset. Several examples are presented that show promising improvement in performance over other basic algorithms, such as OPTICS, in multi-density environments.
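The abstract does not give the exact density criterion, so the following is only a minimal Python sketch of the general idea of judging a grid cell by the densities of its neighbors. It is not the authors' definition. The function name `grid_density_clusters`, the parameters `cell_size` and `ratio`, the 2-D restriction, and the rule "a cell is dense if its count is at least `ratio` times the mean count of its occupied neighbors" are all assumptions made for illustration.

```python
from collections import deque
from itertools import product


def grid_density_clusters(points, cell_size, ratio=0.5):
    """Label 2-D points by cluster; -1 marks noise. Hypothetical rule, not GDCLU itself."""
    # Map every point to its grid cell and count points per cell: one pass over the data.
    cells = [(int(x // cell_size), int(y // cell_size)) for x, y in points]
    counts = {}
    for cell in cells:
        counts[cell] = counts.get(cell, 0) + 1

    def neighbors(cell):
        # The 8 cells surrounding `cell`.
        cx, cy = cell
        for dx, dy in product((-1, 0, 1), repeat=2):
            if (dx, dy) != (0, 0):
                yield (cx + dx, cy + dy)

    # ASSUMPTION: a cell is dense when its count is at least `ratio` times the
    # mean count of its occupied neighbors. Cells with no occupied neighbor are noise.
    dense = set()
    for cell, count in counts.items():
        nearby = [counts[n] for n in neighbors(cell) if n in counts]
        if nearby and count >= ratio * sum(nearby) / len(nearby):
            dense.add(cell)

    # Merge adjacent dense cells into clusters with a breadth-first flood fill.
    cell_label = {}
    next_label = 0
    for start in dense:
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for n in neighbors(current):
                if n in dense and n not in cell_label:
                    cell_label[n] = next_label
                    queue.append(n)
        next_label += 1

    # Each point inherits the label of its cell.
    return [cell_label.get(cell, -1) for cell in cells]
```

Because the threshold is relative to the local neighborhood rather than global, a sparse group of cells and a dense group of cells can both qualify as clusters in the same run, which is the multi-density behavior the abstract describes. The sketch touches each point once and each occupied cell a constant number of times, in line with the stated O(n) complexity.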

[1] Daniel A. Keim, et al. An Efficient Approach to Clustering in Large Multimedia Databases with Noise, 1998, KDD.

[2] F. Ashcroft, et al. VIII. References, 1955.

[3] Pricing Method and Strategy of Catastrophe Insurance Securitization in China, 2007, 2007 International Conference on Service Systems and Service Management.

[4] Hans-Peter Kriegel, et al. OPTICS: ordering points to identify the clustering structure, 1999, SIGMOD '99.

[5] Hassan Abolhassani, et al. MSDBSCAN: Multi-density Scale-Independent Clustering Algorithm Based on DBSCAN, 2010, ADMA.

[6] Petra Perner, et al. Data Mining - Concepts and Techniques, 2002, Künstliche Intell.

[7] Chen Xiaoyun, et al. GMDBSCAN: Multi-Density DBSCAN Cluster Based on Grid, 2008, ICEBE.

[8] Peng Liu, et al. VDBSCAN: Varied Density Based Spatial Clustering of Applications with Noise, 2007, 2007 International Conference on Service Systems and Service Management.

[9] Slava Kisilevich, et al. P-DBSCAN: a density based clustering algorithm for exploration and analysis of attractive areas using collections of geo-tagged photos, 2010, COM.Geo '10.

[10] Hans-Peter Kriegel, et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[11] Lian Duan, et al. A Local Density Based Spatial Clustering Algorithm with Noise, 2006, 2006 IEEE International Conference on Systems, Man and Cybernetics.

[12] Hans-Peter Kriegel, et al. LOF: identifying density-based local outliers, 2000, SIGMOD '00.