An automatic shape independent clustering technique

This article describes a clustering technique that automatically detects any number of well-separated clusters of any shape, convex or non-convex. In contrast, most other techniques assume a value for the number of clusters, a particular cluster structure, or both. The proposed technique is based on an iterative partitioning of the relative neighborhood graph, coupled with a post-processing step for merging small clusters. Techniques for improving the efficiency of the proposed scheme are implemented. The clustering scheme can detect outliers in the data and indicate the inherent hierarchical nature of the clusters present in a data set. It can also identify the situation in which the data have no natural clusters at all. Results demonstrating the effectiveness of the clustering scheme are provided for several data sets.
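
To make the idea of partitioning a relative neighborhood graph concrete, the following is a minimal Python sketch, not the article's actual procedure. It builds the graph from the standard definition (two points are joined unless some third point is closer to both of them than they are to each other) and then performs a single cut. The cutting rule used here (drop edges longer than `factor` times the mean edge length) and the names `relative_neighborhood_graph`, `cut_long_edges`, and `factor` are assumptions made for illustration. The abstract does not state the partitioning criterion, and the iterative refinement and the merging of small clusters are omitted.

```python
from itertools import combinations
from math import dist


def relative_neighborhood_graph(points):
    """Return RNG edges as (i, j, length) via the O(n^3) brute-force definition."""
    n = len(points)
    edges = []
    for i, j in combinations(range(n), 2):
        d_ij = dist(points[i], points[j])
        # (i, j) is an edge unless some third point k is strictly closer
        # to both endpoints than the endpoints are to each other.
        blocked = any(
            max(dist(points[i], points[k]), dist(points[j], points[k])) < d_ij
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.append((i, j, d_ij))
    return edges


def cut_long_edges(n, edges, factor=2.0):
    """Hypothetical one-pass cut: keep edges no longer than factor * mean
    edge length, then label the connected components that remain."""
    if not edges:
        return list(range(n))
    mean_len = sum(length for _, _, length in edges) / len(edges)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j, length in edges:
        if length <= factor * mean_len:
            parent[find(i)] = find(j)

    # Map each component root to a consecutive cluster label.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0, 1), (1, 1),
           (10, 10), (11, 10), (10, 11), (11, 11)]
    rng_edges = relative_neighborhood_graph(pts)
    print(cut_long_edges(len(pts), rng_edges))
```

Because the graph depends only on relative distances between points, a cut of this kind places no constraint on cluster shape. The brute-force construction is cubic in the number of points, so it is suitable only for small illustrative data sets.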
