A tutorial on spectral clustering

Abstract: In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. At first glance spectral clustering appears slightly mysterious, and it is not obvious why it works at all or what it really does. The goal of this tutorial is to give some intuition about those questions. We describe different graph Laplacians and their basic properties, present the most common spectral clustering algorithms, and derive those algorithms from scratch by several different approaches. Advantages and disadvantages of the different spectral clustering algorithms are discussed.
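
To make the "simple to implement" claim concrete, here is a minimal sketch of one common variant, unnormalized spectral clustering, using only NumPy. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the fully connected Gaussian similarity graph, the bandwidth `sigma`, the function name `spectral_clustering`, the farthest-point initialization of the k-means step, and the toy data.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Illustrative unnormalized spectral clustering (hypothetical helper)."""
    # Fully connected similarity graph with Gaussian weights (assumed choice).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order; keep the k smallest.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]  # row i is the spectral embedding of point i

    # Plain k-means on the rows of U, with deterministic farthest-point init.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Toy data: two well-separated blobs of 20 points each (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
labels = spectral_clustering(X, k=2)
```

On data like this, the two blobs are almost disconnected in the similarity graph, so the rows of `U` are nearly constant within each blob and the k-means step separates them easily. Normalized Laplacians or sparse neighborhood graphs would change only the construction of `W` and `L`.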
