Learning Generative Models of Similarity Matrices

Recently, spectral clustering (a.k.a. normalized graph cut) techniques have become popular for their potential ability at finding irregularly-shaped clusters in data. The input to these methods is a similarity measure between every pair of data points. If the clusters are well-separated, the eigenvectors of the similarity matrix can be used to identify the clusters, essentially by identifying groups of points that are related by transitive similarity relationships. However, these techniques fail when the clusters are noisy and not well-separated, or when the scale parameter that is used to map distances between points to similarities is not set correctly. Our approach to solving these problems is to introduce a generative probability model that explicitly models noise and can be trained in a maximum-likelihood fashion to estimate the scale parameter. Exact inference is computationally intractable, but we describe tractable, approximate techniques for inference and learning. Interestingly, it turns out that greedy inference and learning in one of our models with a fixed scale parameter is equivalent to spectral clustering. We examine several data sets, and demonstrate that our method finds better clusters compared with spectral clustering.

[1]  Kun Huang,et al.  A unifying theorem for spectral embedding and clustering , 2003, AISTATS.

[2]  Jitendra Malik,et al.  Normalized Cuts and Image Segmentation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Jianbo Shi,et al.  A Random Walks View of Spectral Segmentation , 2001, AISTATS.

[4]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Tommi S. Jaakkola,et al.  Partially labeled classification with Markov random walks , 2001, NIPS.

[6]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[7]  Naftali Tishby,et al.  Sufficient Dimensionality Reduction - A novel Analysis Method , 2002, ICML.

[8]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[9]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[10]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[11]  Yair Weiss,et al.  Segmentation using eigenvectors: a unifying view , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[12]  C. Lee Giles,et al.  Neural Information Processing Systems 7 , 1995 .

[13]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[14]  Pietro Perona,et al.  A Factorization Approach to Grouping , 1998, ECCV.

[15]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[16]  Jianbo Shi,et al.  Learning Segmentation by Random Walks , 2000, NIPS.