A Novel Density-Based Clustering Framework by Using Level Set Method

This paper proposes a new density-based clustering framework built on the assumption that cluster centers in data space can be treated as target objects in image space. First, level set evolution is used to approximate the cluster centers, starting from a new initial boundary formation scheme. Three types of initial boundaries are defined, each of which evolves toward the cluster centers in a different way. Because level set evolution in data space can require many iterations, an efficient termination criterion stops the evolution once no more cluster centers can be found. Then, a new density representation, called level set density (LSD), is constructed from the evolution results. Finally, valley seeking clustering groups the data points into clusters based on the LSD. Experiments on synthetic and real data sets demonstrate the efficiency and effectiveness of the proposed framework. Comparisons with DBSCAN, OPTICS, and valley seeking clustering further show that the proposed framework avoids the overfitting phenomenon and resolves the confusion between cluster boundary points and outliers.
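
The final grouping step can be illustrated with a minimal sketch. It is not the authors' implementation: the LSD is not specified in this abstract, so a plain Gaussian kernel density stands in for it, and the valley seeking step is simplified to linking each point to the densest point within a fixed radius. The names `kernel_density`, `valley_seeking`, `bandwidth`, and `radius` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def kernel_density(X, bandwidth=1.0):
    """Stand-in density (assumption): Gaussian kernel sum, NOT the paper's LSD."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    density = np.exp(-dist**2 / (2.0 * bandwidth**2)).sum(axis=1)
    return density, dist

def valley_seeking(density, dist, radius):
    """Simplified valley seeking: link each point to the densest point
    within `radius`; points with no denser neighbour become cluster roots."""
    n = len(density)
    parent = np.arange(n)
    for i in range(n):
        nbrs = np.flatnonzero(dist[i] <= radius)   # includes i itself
        best = nbrs[np.argmax(density[nbrs])]
        if density[best] > density[i]:             # strictly uphill, so no cycles
            parent[i] = best
    # Follow the uphill links until a root is reached.
    roots = np.empty(n, dtype=int)
    for i in range(n):
        r = i
        while parent[r] != r:
            r = parent[r]
        roots[i] = r
    # Relabel root indices as consecutive cluster ids 0..c-1.
    _, labels = np.unique(roots, return_inverse=True)
    return labels

# Toy data: two groups of three collinear points, far apart.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
              [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]])
density, dist = kernel_density(X, bandwidth=1.0)
labels = valley_seeking(density, dist, radius=1.5)
print(labels)
```

In this sketch, the low-density gap between the two groups acts as the valley: no uphill link crosses it, so each group collapses onto its own density peak. Swapping in a different density estimate only requires replacing `kernel_density`.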
