Addendum to: An Approach to Hierarchical Clustering via Level Surfaces and Convexity

This article is an addendum to the 2001 paper [6], which investigated an approach to hierarchical clustering based on the level sets of a density function induced on data points in a d-dimensional feature space. We refer to this as the “level-sets approach” to hierarchical clustering. The density functions considered in [6] were formed as the sum of identical radial basis functions centered at the data points, each radial basis function assumed to be continuous, monotone decreasing, convex on every ray, and rising to positive infinity at its center point. This framework can be investigated with respect to both the Euclidean (L2) and Manhattan (L1) metrics. The present addendum puts forth observations and questions about the level-sets approach that go beyond those in [6]. In particular, we ask the following questions. How does the level-sets approach compare with other related approaches? How is the resulting hierarchical clustering affected by the choice of radial basis function? What are the structural properties of a function formed as the sum of radial basis functions? Can the level-sets approach be theoretically validated? Is there an efficient algorithm to implement the level-sets approach?
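To make the construction concrete, the following is a minimal sketch, not the algorithm of the original paper, of the level-sets idea in two dimensions: the density is the sum of identical radial basis functions phi(r) = 1/r (continuous, monotone decreasing, convex on every ray, and rising to positive infinity at each data point), and clusters at a given level are the connected components of the superlevel set {f >= c} that contain data points. The data points, grid bounds, step size, and levels are illustrative choices, and the grid-based union-find is only an approximation of the true connected components.

```python
import math

# Illustrative data: two well-separated pairs of points.
POINTS = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9)]
LO, HI, STEP = -1.0, 4.5, 0.1  # grid bounds and resolution (assumed)

def phi(r):
    """Radial basis function 1/r: continuous, monotone decreasing,
    convex on every ray, and +infinity at its center (r = 0)."""
    return math.inf if r == 0.0 else 1.0 / r

def density(x, y):
    """Induced density: the sum of one RBF per data point."""
    return sum(phi(math.hypot(x - px, y - py)) for px, py in POINTS)

def superlevel_find(level):
    """Union-find over grid cells with density >= level; merging
    4-adjacent cells approximates the connected components of the
    superlevel set {f >= level}."""
    n = int(round((HI - LO) / STEP)) + 1
    parent = {}
    for i in range(n):
        for j in range(n):
            if density(LO + i * STEP, LO + j * STEP) >= level:
                parent[(i, j)] = (i, j)

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for (i, j) in list(parent):
        for nb in ((i + 1, j), (i, j + 1)):
            if nb in parent:
                parent[find((i, j))] = find(nb)
    return find

def clusters_at(level):
    """Group the data points by the superlevel component they lie in.
    A cell containing a data point exceeds any finite level, since
    phi blows up at its center."""
    find = superlevel_find(level)
    groups = {}
    for (px, py) in POINTS:
        cell = (int(round((px - LO) / STEP)), int(round((py - LO) / STEP)))
        groups.setdefault(find(cell), []).append((px, py))
    return list(groups.values())

if __name__ == "__main__":
    for level in (1.0, 3.0, 25.0):
        print(level, clusters_at(level))
```

Sweeping the level upward traces out the hierarchy: at a low level all four points lie in one component, at an intermediate level the two pairs separate, and at a high level each point sits in its own component.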

[1] S. Kotsiantis et al. Recent Advances in Clustering: A Brief Survey, 2004.

[2] James A. Sethian et al. Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science, 2012.

[3] R. Graham et al. The Steiner problem in phylogeny is NP-complete, 1982.

[4] Michael I. Jordan et al. Hierarchical Dirichlet Processes, 2006.

[5] W. H. Day. Computationally difficult parsimony problems in phylogenetic systematics, 1983.

[6] Seth M. Malitz et al. An approach to hierarchical clustering via level surfaces and convexity, 2001, Discrete and Computational Geometry.

[7] Demetri Terzopoulos et al. Snakes: Active contour models, 2004, International Journal of Computer Vision.

[8] Ronald L. Rivest et al. Constructing Optimal Binary Decision Trees is NP-Complete, 1976, Inf. Process. Lett.

[9] Vincent Kanade et al. Clustering Algorithms, 2021, Wireless RF Energy Transfer in the Massive IoT Era.

[10] David S. Johnson et al. The Rectilinear Steiner Tree Problem is NP-Complete, 1977, SIAM Journal on Applied Mathematics.

[11] Gilbert Strang. Introduction to Applied Mathematics, 1988.

[12] Daniel A. Keim et al. An Efficient Approach to Clustering in Large Multimedia Databases with Noise, 1998, KDD.

[13] Tamir Tuller et al. Finding a maximum likelihood tree is hard, 2006, JACM.

[14] Larry D. Hostetler et al. The estimation of the gradient of a density function, with applications in pattern recognition, 1975, IEEE Trans. Inf. Theory.

[15] Michalis Vazirgiannis et al. On Clustering Validation Techniques, 2001, Journal of Intelligent Information Systems.

[16] E. Cockayne. On the Steiner Problem, 1967, Canadian Mathematical Bulletin.

[17] Yen-Jen Oyang et al. A Study on the Hierarchical Data Clustering Algorithm Based on Gravity Theory, 2001, PKDD.

[18] Alex M. Andrew. Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science (2nd edition), 2000.

[19] S. Axler et al. Harmonic Function Theory, 1992.

[20] Yen-Jen Oyang et al. An Incremental Hierarchical Data Clustering Algorithm Based on Gravity Theory, 2002, PAKDD.

[21] Charu C. Aggarwal et al. On the Surprising Behavior of Distance Metrics in High Dimensional Spaces, 2001, ICDT.