Nonconvex Sparse Spectral Clustering by Alternating Direction Method of Multipliers and Its Convergence Analysis

Spectral Clustering (SC) is a widely used data clustering method which first learns a low-dimensional embedding $U$ of data by computing the eigenvectors of the normalized Laplacian matrix, and then performs k-means on $U^\top$ to get the final clustering result. The Sparse Spectral Clustering (SSC) method extends SC with a sparse regularization on $UU^\top$ by using the block diagonal structure prior of $UU^\top$ in the ideal case. However, encouraging $UU^\top$ to be sparse leads to a heavily nonconvex problem which is challenging to solve and the work (Lu, Yan, and Lin 2016) proposes a convex relaxation in the pursuit of this aim indirectly. However, the convex relaxation generally leads to a loose approximation and the quality of the solution is not clear. This work instead considers to solve the nonconvex formulation of SSC which directly encourages $UU^\top$ to be sparse. We propose an efficient Alternating Direction Method of Multipliers (ADMM) to solve the nonconvex SSC and provide the convergence guarantee. In particular, we prove that the sequences generated by ADMM always exist a limit point and any limit point is a stationary point. Our analysis does not impose any assumptions on the iterates and thus is practical. Our proposed ADMM for nonconvex problems allows the stepsize to be increasing but upper bounded, and this makes it very efficient in practice. Experimental analysis on several real data sets verifies the effectiveness of our method.

[1]  Zhi-Quan Luo,et al.  Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems , 2014, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[2]  Zhixun Su,et al.  Linearized alternating direction method with parallel splitting and adaptive penalty for separable convex programs in machine learning , 2013, Machine Learning.

[3]  Shuicheng Yan,et al.  Convex Sparse Spectral Clustering: Single-View to Multi-View , 2015, IEEE Transactions on Image Processing.

[4]  Yurii Nesterov,et al.  Smooth minimization of non-smooth functions , 2005, Math. Program..

[5]  Inderjit S. Dhillon,et al.  Co-clustering documents and words using bipartite spectral graph partitioning , 2001, KDD '01.

[6]  Aleix M. Martinez,et al.  The AR face database , 1998 .

[7]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[8]  Shuicheng Yan,et al.  Nonconvex Nonsmooth Low Rank Minimization via Iteratively Reweighted Nuclear Norm , 2015, IEEE Transactions on Image Processing.

[9]  B. Mercier,et al.  A dual algorithm for the solution of nonlinear variational problems via finite element approximation , 1976 .

[10]  A. Martínez,et al.  The AR face databasae , 1998 .

[11]  Shuicheng Yan,et al.  Robust and Efficient Subspace Segmentation via Least Squares Regression , 2012, ECCV.

[12]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  René Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications , 2012, IEEE transactions on pattern analysis and machine intelligence.

[14]  Shuicheng Yan,et al.  Generalized Singular Value Thresholding , 2014, AAAI.

[15]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[16]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[17]  J. P. Williams,et al.  Some convexity theorems for matrices , 1971, Glasgow Mathematical Journal.

[18]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression (PIE) database , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[19]  Wotao Yin,et al.  Global Convergence of ADMM in Nonconvex Nonsmooth Optimization , 2015, Journal of Scientific Computing.

[20]  Ulrike von Luxburg,et al.  A tutorial on spectral clustering , 2007, Stat. Comput..

[21]  David J. Kriegman,et al.  From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[22]  Shuicheng Yan,et al.  A Unified Alternating Direction Method of Multipliers by Majorization Minimization , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Julien Mairal,et al.  Optimization with First-Order Surrogate Functions , 2013, ICML.

[24]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[25]  Bingsheng He,et al.  On the O(1/n) Convergence Rate of the Douglas-Rachford Alternating Direction Method , 2012, SIAM J. Numer. Anal..