A unifying approach to hard and probabilistic clustering

We derive the clustering problem from first principles, showing that the goal of achieving a probabilistic, or "hard", multi-class clustering result is equivalent to the algebraic problem of finding a completely positive factorization under a doubly stochastic constraint. We show that spectral clustering, normalized cuts, kernel K-means, and the various normalizations of the associated affinity matrix are particular instances and approximations of this general principle. We propose an efficient algorithm for computing a completely positive factorization and extend the basic clustering scheme to situations where partial label information is available.
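
To make the doubly stochastic constraint concrete, the following minimal sketch rescales a small symmetric affinity matrix until every row and column sums to one. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the function name `symmetric_normalize`, the toy matrix `affinity`, and the choice of repeatedly applying the normalization D^{-1/2} K D^{-1/2} (with D the diagonal matrix of row sums) are all hypothetical choices made for this example, and the sketch assumes a symmetric matrix with strictly positive entries.

```python
import math


def symmetric_normalize(K, max_iters=1000, tol=1e-10):
    """Repeatedly apply K <- D^{-1/2} K D^{-1/2}, where D = diag(row sums of K).

    Assumes K is a symmetric list-of-lists with strictly positive entries.
    Stops once every row sum is within `tol` of 1, or after `max_iters` passes.
    """
    n = len(K)
    K = [row[:] for row in K]  # work on a copy, leave the input untouched
    for _ in range(max_iters):
        row_sums = [sum(row) for row in K]
        if max(abs(s - 1.0) for s in row_sums) < tol:
            break
        scale = [1.0 / math.sqrt(s) for s in row_sums]
        # Scaling rows and columns by the same factors preserves symmetry.
        K = [[scale[i] * K[i][j] * scale[j] for j in range(n)] for i in range(n)]
    return K


# Hypothetical 3-point affinity matrix: points 2 and 3 are similar to each other.
affinity = [
    [1.0, 0.2, 0.1],
    [0.2, 1.0, 0.8],
    [0.1, 0.8, 1.0],
]

F = symmetric_normalize(affinity)
for row in F:
    print([round(x, 4) for x in row], "row sum:", round(sum(row), 6))
```

Because the same scaling is applied to rows and columns, the result stays symmetric and non-negative, so its column sums equal its row sums. A single pass of this rescaling is the normalization commonly used in normalized-cuts style spectral clustering; iterating it is one simple way to approach a doubly stochastic matrix.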
