Super Paramagnetic Clustering of Data (SPC)

1 Summary

A new clustering algorithm named Super Paramagnetic Clustering (SPC), published by Blatt et al. [1, 2, 3], is described. The algorithm, which was first motivated by properties of inhomogeneous magnets, is presented as an outcome of a general technique for constructing robust clustering algorithms. The technique uses a clustering cost function and finds the average clustering solution at different average costs. One can usually find several intervals of average cost values (phases) in which the average clustering solution is stable and has a distinctive behavior. Each of these stable solutions represents a real property of the data, since it appears in most of the clustering proposals at that average cost. All that is left is to choose one or several of the stable solutions to represent the structure in the data. Note that all that is needed to use this technique is a clustering cost function. The cost function should incorporate all the preferences one has regarding the clustering; therefore, all clustering proposals with the same cost should be considered equally good.

The SPC algorithm is a non-parametric clustering algorithm, and as such it is based on local criteria. The SPC cost function penalizes assigning close points to different clusters. Each point $x_i$ is assigned a "color" $s_i = 1, \ldots, q$. Any neighboring pair $\langle i, j \rangle$ whose colors do not agree incurs a penalty $J_{ij}$ that decays with distance. The cost function is defined as