Clustering via Matrix Exponentiation

Given a set of n points and a matrix of pairwise similarity measures, one would like to partition the points into clusters so that similar points are grouped together and dissimilar points are kept apart. We present an algorithm requiring only matrix exponentiation that performs well in practice and has an elegant interpretation in terms of random walks on a graph. Under a certain mixture model, in which a partition is planted via randomized rounding of tailored matrix entries, the algorithm can be proven effective with only a single squaring. It is shown that the clustering performance of the algorithm degrades with larger values of the exponent, thus revealing that a single squaring is optimal.

Thesis Supervisor: Santosh Vempala
Title: Associate Professor
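To make the idea of a single squaring concrete, the following is a minimal sketch and not the thesis's algorithm. It assumes a simple two-block planted partition in which each entry of a symmetric 0/1 similarity matrix is 1 with probability p inside a block and q across blocks. Entry (i, j) of the squared matrix counts walks of length two from i to j, so points in the same block tend to share many more common neighbors. The function names, the parameter values, and the midpoint threshold rule are all illustrative assumptions.

```python
import random


def planted_partition(n, k, p, q, seed=0):
    """Hypothetical generator: symmetric 0/1 matrix with k equal-sized blocks.

    An entry is 1 with probability p inside a block and q across blocks.
    """
    rng = random.Random(seed)
    labels = [i * k // n for i in range(n)]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            prob = p if labels[i] == labels[j] else q
            A[i][j] = A[j][i] = 1 if rng.random() < prob else 0
    return A, labels


def square(A):
    """Return A*A; entry (i, j) counts walks of length two from i to j."""
    n = len(A)
    return [[sum(A[i][t] * A[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def cluster_by_threshold(B, threshold):
    """Greedy grouping (an assumption of this sketch): each unassigned point
    seeds a cluster and absorbs every unassigned j with B[seed][j] >= threshold."""
    n = len(B)
    assignment = [-1] * n
    next_id = 0
    for i in range(n):
        if assignment[i] != -1:
            continue
        assignment[i] = next_id
        for j in range(n):
            if assignment[j] == -1 and B[i][j] >= threshold:
                assignment[j] = next_id
        next_id += 1
    return assignment


# Illustrative parameters (assumed, not taken from the thesis).
n, k, p, q = 80, 2, 0.9, 0.05
A, labels = planted_partition(n, k, p, q, seed=1)
B = square(A)

# Midpoint between the expected within-block and cross-block entries of A*A.
block = n // k
within = block * p * p + (n - block) * q * q
cross = 2 * block * p * q + (n - 2 * block) * q * q
assignment = cluster_by_threshold(B, (within + cross) / 2)

agreement = sum(a == b for a, b in zip(assignment, labels)) / n
print(f"fraction of points assigned to their planted block: {agreement:.2f}")
```

With these well-separated parameters the gap between within-block and cross-block entries of the squared matrix is large relative to its fluctuations, which is why a plain threshold suffices here; harder instances would need a more careful rule.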
