On Clusters that are Separated but Large

Given a set P of n points in R^d, consider the problem of computing k subsets of P that form clusters that are well separated from each other, each of which is large (in cardinality). We provide tight upper and lower bounds, and corresponding algorithms, on the quality of separation and the size of the clusters that can be computed, as a function of n, d, k, s, and Φ, where s is the desired separation and Φ is the spread of the point set P.
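
To make the two parameters s and Φ concrete, the following minimal sketch computes the spread of a small point set and tests whether two candidate clusters are separated. It rests on assumptions not stated in the abstract: the spread is taken to be the usual ratio of the largest to the smallest pairwise distance, and two clusters are treated as s-separated when the distance between them is at least s times the larger of their diameters (the convention used for well-separated pairs). The function names are hypothetical, and the quadratic-time brute force is for illustration only; it is not the algorithm of the paper.

```python
import math
from itertools import combinations


def spread(points):
    """Ratio of the largest to the smallest pairwise distance.

    Assumes at least two points, all distinct.
    """
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return max(dists) / min(dists)


def diameter(cluster):
    """Largest pairwise distance inside a cluster (0 for a single point)."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def set_distance(a, b):
    """Smallest distance between a point of a and a point of b."""
    return min(math.dist(p, q) for p in a for q in b)


def are_separated(a, b, s):
    """Assumed definition: the gap is at least s times the larger diameter."""
    return set_distance(a, b) >= s * max(diameter(a), diameter(b))


# Toy instance in the plane: two visibly distinct groups.
A = [(0, 0), (1, 0), (0, 1)]
B = [(10, 10), (11, 10)]
P = A + B

print(spread(P))               # largest over smallest pairwise distance
print(are_separated(A, B, 5))  # gap vs. 5 * max diameter
print(are_separated(A, B, 10)) # gap vs. 10 * max diameter
```

In this toy instance the gap between A and B is sqrt(181) and the larger diameter is sqrt(2), so the pair passes the test for s = 5 and fails it for s = 10. This illustrates the trade-off in the abstract: demanding a larger separation s restricts which clusters qualify, and hence how large they can be.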
