Similarity-based clustering by left-stochastic matrix factorization

For similarity-based clustering, we propose modeling the entries of a given similarity matrix as the inner products of the unknown cluster probabilities. To estimate the cluster probabilities from the given similarity matrix, we introduce a left-stochastic non-negative matrix factorization problem. A rotation-based algorithm is proposed for the matrix factorization. Conditions for unique matrix factorizations and clusterings are given, and an error bound is provided. The algorithm is particularly efficient for the case of two clusters, which motivates a hierarchical variant for cases where the number of desired clusters is large. Experiments show that the proposed left-stochastic decomposition clustering model produces relatively high within-cluster similarity on most data sets and can match given class labels, and that the efficient hierarchical variant performs surprisingly well.
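As a rough illustration of the modeling assumption, the sketch below builds a small left-stochastic matrix (non-negative entries, each column summing to one), forms the similarity matrix whose entries are inner products of the columns, and reads off hard clusters from the largest probability in each column. It covers only the forward model, not the rotation-based factorization algorithm. The names P, K, is_left_stochastic, similarity_from_probabilities, and hard_clusters are hypothetical, the toy values are made up for illustration, and any scaling constant the full model may include is omitted.

```python
# Illustrative sketch only; all names and toy values are assumptions, not from the paper.

def is_left_stochastic(P, tol=1e-9):
    """Return True if P is non-negative and every column sums to one."""
    k, n = len(P), len(P[0])
    nonneg = all(P[i][j] >= 0 for i in range(k) for j in range(n))
    cols_ok = all(
        abs(sum(P[i][j] for i in range(k)) - 1.0) <= tol for j in range(n)
    )
    return nonneg and cols_ok


def similarity_from_probabilities(P):
    """K[a][b] is the inner product of columns a and b of P (K = P^T P)."""
    k, n = len(P), len(P[0])
    return [
        [sum(P[i][a] * P[i][b] for i in range(k)) for b in range(n)]
        for a in range(n)
    ]


def hard_clusters(P):
    """Assign each sample (column) to its most probable cluster (row index)."""
    k, n = len(P), len(P[0])
    return [max(range(k), key=lambda i: P[i][j]) for j in range(n)]


# Toy example: k = 2 clusters, n = 4 samples.
# Column j holds the cluster probabilities of sample j.
P = [
    [0.9, 0.8, 0.1, 0.3],
    [0.1, 0.2, 0.9, 0.7],
]
K = similarity_from_probabilities(P)
labels = hard_clusters(P)
```

In this toy case, samples with similar cluster probabilities receive a larger modeled similarity than samples that lean toward different clusters, which is the structure the factorization would try to recover from a given similarity matrix.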
