Fast, accurate spectral clustering using locally linear landmarks

For problems of image or video segmentation, where clusters have a complex structure, a leading method is spectral clustering. It works by encoding the similarity between pairs of points into an affinity matrix and applying k-means in its low-order eigenspace, where the clustering structure is enhanced. When the number of points is large, an approximation is necessary to limit the runtime, even if the affinity matrix is sparse. This is commonly done with the Nyström formula, where one solves an eigenproblem using affinities between a subset of the data points (landmarks) and then estimates the eigenvectors over the entire data by interpolation. In practice, this can still require many landmarks to achieve reasonably accurate solutions, and it applies only to explicitly defined affinity kernels. In this paper we propose two ideas: the Locally Linear Landmarks technique, where one solves a reduced spectral problem over landmarks that involves the entire, original affinity matrix; and a fast, good initialization for k-means. We show that both approximation error and runtime are considerably reduced, even though fewer landmarks are used. We apply the approach to spectral clustering and to several variants of it that involve complex affinities: constrained clustering, affinity aggregation, neighborhood graphs based on tree ensembles, and video segmentation.
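
As an illustration of the baseline pipeline described above (affinity matrix, low-order eigenspace, k-means), the following is a minimal sketch of exact spectral clustering on a small dataset. It is not the Locally Linear Landmarks method and not the k-means initialization proposed in the paper. The Gaussian affinity, the bandwidth `sigma`, the symmetric normalization, the row normalization of the embedding, and the farthest-point seeding are all assumptions chosen for simplicity, and the function names are hypothetical.

```python
import numpy as np

def spectral_embedding(X, k, sigma=1.0):
    """Embed points using the top-k eigenvectors of a normalized affinity matrix."""
    # Dense Gaussian affinities between all pairs of points (assumed kernel).
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k largest.
    _, vecs = np.linalg.eigh(M)
    U = vecs[:, -k:]
    # Row-normalize so each embedded point lies on the unit sphere.
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(U, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point seeding (a simple stand-in)."""
    centers = [U[0]]
    for _ in range(1, k):
        # Next seed: the point farthest from all seeds chosen so far.
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.sum((U[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels

# Usage: two well-separated synthetic blobs of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(10.0, 0.5, size=(30, 2))])
labels = kmeans(spectral_embedding(X, k=2), k=2)
```

The dense N x N affinity matrix and the full eigendecomposition make this sketch cost O(N^3) time, which is the expense that landmark-based approximations such as the Nyström formula aim to avoid for large N.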
