Structured Doubly Stochastic Matrix for Graph Based Clustering

Clustering is one of the most important topics in machine learning and has been applied extensively across many areas. Its widespread use in both scientific research and industrial practice has drawn considerable attention. Many clustering methods have been developed, among which graph based clustering methods that use an affinity matrix have received particular emphasis. Recent work used the doubly stochastic matrix to normalize the input affinity matrix and enhance graph based clustering models. Although the doubly stochastic matrix can improve clustering performance, the clustering structure in the doubly stochastic matrix is not as clear as expected. Thus, a post-processing step is required to extract the final clustering results, and this step may not be optimal. To address this problem, in this paper we propose a novel convex model that learns a structured doubly stochastic matrix by imposing a low-rank constraint on the graph Laplacian matrix. Our new structured doubly stochastic matrix explicitly uncovers the clustering structure and encodes the probability that each pair of data points is connected, so that the clustering results are enhanced. An efficient optimization algorithm is derived to solve our new objective. We also provide theoretical discussions showing that, depending on the input, our method has interesting connections with K-means and spectral graph cut models, respectively. We conduct experiments on both synthetic and benchmark datasets to validate the performance of our proposed method. The empirical results demonstrate that our model provides a better approach to solving the K-means clustering problem: when the cluster indicator provided by our model is used as initialization, K-means converges to a smaller objective function value with better clustering performance. Moreover, we compare the clustering performance of our model with spectral clustering and a related doubly stochastic model. On all datasets, our method performs as well as or better than the related methods.
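
The two ingredients named above, doubly stochastic normalization of an affinity matrix and the rank of the graph Laplacian, can be illustrated with a small sketch. This is not the proposed model or its optimization algorithm. It assumes a plain Sinkhorn-Knopp alternating normalization as a stand-in for doubly stochastic normalization, and it uses the standard spectral graph theory fact that the number of zero eigenvalues of a graph Laplacian equals the number of connected components. The function names `sinkhorn_knopp` and `count_components`, the toy affinity matrix, and the tolerances are all illustrative assumptions.

```python
import numpy as np


def sinkhorn_knopp(affinity, n_iter=1000, tol=1e-10):
    """Alternately rescale rows and columns of a non-negative matrix
    until both sum to 1 (Sinkhorn-Knopp, used here as a stand-in)."""
    S = np.asarray(affinity, dtype=float).copy()
    for _ in range(n_iter):
        S /= S.sum(axis=1, keepdims=True)  # make every row sum to 1
        S /= S.sum(axis=0, keepdims=True)  # make every column sum to 1
        if np.abs(S.sum(axis=1) - 1.0).max() < tol:
            break
    return S


def count_components(S, eps=1e-8):
    """Count near-zero eigenvalues of the Laplacian of the symmetrized graph.
    Each zero eigenvalue corresponds to one connected component."""
    W = (S + S.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W
    eigvals = np.linalg.eigvalsh(L)
    return int((eigvals < eps).sum())


# Toy affinity matrix (hypothetical data): two groups with no cross links.
rng = np.random.default_rng(0)


def random_block(n):
    B = rng.uniform(0.5, 1.5, size=(n, n))
    return (B + B.T) / 2.0  # symmetric, strictly positive block


A = np.zeros((5, 5))
A[:3, :3] = random_block(3)
A[3:, 3:] = random_block(2)

S = sinkhorn_knopp(A)
n_clusters = count_components(S)

print("row sums:", S.sum(axis=1))
print("column sums:", S.sum(axis=0))
print("connected components:", n_clusters)
```

In this toy case the affinity matrix is already block diagonal, so the cluster structure can be read directly from the connected components. On real affinity matrices the normalized graph is usually connected, which is the situation where a separate post-processing step would otherwise be needed.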
