A Vibration Method for Discovering Density Varied Clusters

DBSCAN is a base algorithm for density-based clustering. It can discover clusters of different shapes and sizes in large data sets that contain noise and outliers. However, it fails to handle local density variation within a cluster. A good clustering method should tolerate significant density variation within a cluster, because insisting on homogeneous clusters may generate a large number of small, unimportant clusters. In this paper, we propose an enhancement of the DBSCAN algorithm that detects clusters of different shapes and sizes that differ in local density. Our proposed method, VMDBSCAN, first finds the “core” of each cluster generated by applying DBSCAN. It then “vibrates” points toward the cluster that has the maximum influence on them. As a result, the proposed method can find the correct number of clusters.
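
The two stages described above can be illustrated with a small sketch. It is not the VMDBSCAN implementation: the abstract does not define how a cluster "core" or the "influence" of a cluster is computed, so the sketch assumes the core is the centroid of the cluster's members and the influence is the cluster size divided by one plus the distance to that centroid. It also relabels points instead of moving them. The names `cluster_cores` and `reassign_by_influence` are hypothetical; only the `dbscan` function follows the standard, well-known algorithm.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def cluster_cores(points, labels):
    """ASSUMPTION: the 'core' of a cluster is the centroid of its members.

    Returns {label: (centroid, member_count)}.
    """
    members = {}
    for p, lab in zip(points, labels):
        if lab != NOISE:
            members.setdefault(lab, []).append(p)
    return {
        lab: (tuple(sum(c) / len(ps) for c in zip(*ps)), len(ps))
        for lab, ps in members.items()
    }


def reassign_by_influence(points, labels):
    """ASSUMPTION: influence of a cluster on p = size / (1 + distance to core).

    Each non-noise point is relabeled to the cluster with maximum influence.
    """
    cores = cluster_cores(points, labels)

    def influence(lab, p):
        centroid, size = cores[lab]
        return size / (1.0 + dist(p, centroid))

    return [
        lab if lab == NOISE else max(cores, key=lambda c: influence(c, p))
        for p, lab in zip(points, labels)
    ]


# Toy data: a dense 5x5 grid plus a short tail separated by a gap wider than eps.
grid = [(x, y) for x in range(5) for y in range(5)]
tail = [(6, 2), (7, 2), (8, 2)]
points = grid + tail

labels = dbscan(points, eps=1.5, min_pts=3)
merged = reassign_by_influence(points, labels)
print(len(set(labels)), "clusters after DBSCAN")
print(len(set(merged)), "clusters after the influence step")
```

On this toy data, DBSCAN reports the tail as a separate small cluster, and the assumed influence rule absorbs it into the large one. With a different influence function the outcome could differ, so the example only shows the shape of the two-stage procedure.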
