Fully Dynamic Clustering of Metric Data Sets

The goal of cluster analysis [10] is to find homogeneous groups, or clusters, in data. Homogeneity is often made precise by means of a dissimilarity function on objects that takes low values on pairs of objects in the same cluster. Cluster analysis has also been investigated in data mining [5], with an emphasis on efficiency for data sets larger than main memory [4,6,8,9,16]. More recently, the growing importance of multimedia and transactional databases has stimulated interest in metric clustering, i.e. clustering in which the dissimilarity function satisfies the triangle inequality.
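To make the metric condition concrete, the following minimal sketch checks whether a dissimilarity function behaves like a metric on a finite sample of objects. The names `is_metric`, `abs_diff`, and `sq_diff` are hypothetical and chosen for illustration only; they do not come from the paper, and passing the check on a sample does not prove that the function is a metric in general.

```python
from itertools import permutations


def is_metric(points, d, tol=1e-12):
    """Check the metric axioms for dissimilarity d on a finite sample."""
    # Every object must be at dissimilarity zero from itself.
    for x in points:
        if abs(d(x, x)) > tol:
            return False
    # Non-negativity and symmetry on every ordered pair.
    for x, y in permutations(points, 2):
        if d(x, y) < 0 or abs(d(x, y) - d(y, x)) > tol:
            return False
    # Triangle inequality: d(x, z) <= d(x, y) + d(y, z).
    for x, y, z in permutations(points, 3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            return False
    return True


def abs_diff(a, b):
    """Absolute difference: a metric on the real numbers."""
    return abs(a - b)


def sq_diff(a, b):
    """Squared difference: a dissimilarity that is not a metric."""
    return (a - b) ** 2


sample = [0, 1, 2]
abs_ok = is_metric(sample, abs_diff)
sq_ok = is_metric(sample, sq_diff)
```

On this sample the squared difference fails because the dissimilarity between 0 and 2 is 4, which exceeds the sum 1 + 1 of the dissimilarities through the intermediate point 1, so it would not qualify for metric clustering in the sense described above.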

[1]  Ray A. Jarvis et al. Clustering Using a Similarity Measure Based on Shared Near Neighbors, 1973, IEEE Transactions on Computers.

[2]  Hans-Peter Kriegel et al. Incremental Clustering for Mining in a Data Warehousing Environment, 1998, VLDB.

[3]  Tian Zhang et al. BIRCH: an efficient data clustering method for very large databases, 1996, SIGMOD '96.

[4]  Sudipto Guha et al. CURE: an efficient clustering algorithm for large databases, 1998, SIGMOD '98.

[5]  Hans-Peter Kriegel et al. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, 1996, KDD.

[6]  John A. Hartigan et al. Clustering Algorithms, 1975.

[7]  Peter Bro Miltersen et al. Complexity Models for Incremental Computation, 1994, Theor. Comput. Sci.

[8]  G. Krishna et al. Agglomerative clustering using the concept of mutual nearest neighbourhood, 1978, Pattern Recognit.

[9]  Sudipto Guha et al. ROCK: A Robust Clustering Algorithm for Categorical Attributes, 2000, Inf. Syst.

[10]  Jeffrey Scott Vitter et al. External memory algorithms and data structures, 1999, External Memory Algorithms.

[11]  Pavel Zezula et al. M-tree: An Efficient Access Method for Similarity Search in Metric Spaces, 1997, VLDB.

[12]  Edward F. Grove et al. External-memory graph algorithms, 1995, SODA '95.

[13]  Mikkel Thorup et al. Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity, 1998, STOC '98.

[14]  James C. French et al. Clustering large datasets in arbitrary metric spaces, 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).