Learning with $\ell^{0}$-Graph: $\ell^{0}$-Induced Sparse Subspace Clustering

Sparse subspace clustering methods, such as Sparse Subspace Clustering (SSC) \cite{ElhamifarV13} and $\ell^{1}$-graph \cite{YanW09,ChengYYFH10}, are effective in partitioning the data that lie in a union of subspaces. Most of those methods use $\ell^{1}$-norm or $\ell^{2}$-norm with thresholding to impose the sparsity of the constructed sparse similarity graph, and certain assumptions, e.g. independence or disjointness, on the subspaces are required to obtain the subspace-sparse representation, which is the key to their success. Such assumptions are not guaranteed to hold in practice and they limit the application of sparse subspace clustering on subspaces with general location. In this paper, we propose a new sparse subspace clustering method named $\ell^{0}$-graph. In contrast to the required assumptions on subspaces for most existing sparse subspace clustering methods, it is proved that subspace-sparse representation can be obtained by $\ell^{0}$-graph for arbitrary distinct underlying subspaces almost surely under the mild i.i.d. assumption on the data generation. We develop a proximal method to obtain the sub-optimal solution to the optimization problem of $\ell^{0}$-graph with proved guarantee of convergence. Moreover, we propose a regularized $\ell^{0}$-graph that encourages nearby data to have similar neighbors so that the similarity graph is more aligned within each cluster and the graph connectivity issue is alleviated. Extensive experimental results on various data sets demonstrate the superiority of $\ell^{0}$-graph compared to other competing clustering methods, as well as the effectiveness of regularized $\ell^{0}$-graph.

[1]  René Vidal,et al.  Sparse subspace clustering , 2009, CVPR.

[2]  Kaushik Mahata,et al.  An approximate L0 norm minimization algorithm for compressed sensing , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[3]  René Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications , 2012, IEEE transactions on pattern analysis and machine intelligence.

[4]  Marc Teboulle,et al.  Proximal alternating linearized method for nonconvex and nonsmooth problems , 2014 .

[5]  Shuicheng Yan,et al.  Learning With $\ell ^{1}$-Graph for Image Analysis , 2010, IEEE Transactions on Image Processing.

[6]  Guillermo Sapiro,et al.  Online Learning for Matrix Factorization and Sparse Coding , 2009, J. Mach. Learn. Res..

[7]  Satu Elisa Schaeffer,et al.  Graph Clustering , 2017, Encyclopedia of Machine Learning and Data Mining.

[8]  Hans-Peter Kriegel,et al.  Subspace clustering , 2012, WIREs Data Mining Knowl. Discov..

[9]  Chun Chen,et al.  Graph Regularized Sparse Coding for Image Representation , 2011, IEEE Transactions on Image Processing.

[10]  Constantine Caramanis,et al.  Greedy Subspace Clustering , 2014, NIPS.

[11]  L. Lovász Matching Theory (North-Holland mathematics studies) , 1986 .

[12]  Shuicheng Yan,et al.  Semi-supervised Learning by Sparse Representation , 2009, SDM.

[13]  A. Raftery,et al.  Model-Based Clustering , Discriminant Analysis , and Density Estimation 1 , 2002 .

[14]  R. Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications. , 2013, IEEE transactions on pattern analysis and machine intelligence.

[15]  Jiawei Han,et al.  Regularized l1-Graph for Data Clustering , 2014, BMVC.

[16]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[17]  Emmanuel J. Candès,et al.  A Geometric Analysis of Subspace Clustering with Outliers , 2011, ArXiv.

[18]  Zhang Yi,et al.  Robust Subspace Clustering via Thresholding Ridge Regression , 2015, AAAI.

[19]  Guillermo Sapiro,et al.  Supervised Dictionary Learning , 2008, NIPS.

[20]  Zuowei Shen,et al.  L0 Norm Based Dictionary Learning by Proximal Methods with Global Convergence , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Joel A. Tropp,et al.  Greed is good: algorithmic results for sparse approximation , 2004, IEEE Transactions on Information Theory.

[22]  Yong Yu,et al.  Robust Subspace Segmentation by Low-Rank Representation , 2010, ICML.

[23]  Delbert Dueck,et al.  Clustering by Passing Messages Between Data Points , 2007, Science.

[24]  Aswin C. Sankaranarayanan,et al.  Greedy feature selection for subspace clustering , 2013, J. Mach. Learn. Res..

[25]  Julien Mairal,et al.  Proximal Methods for Sparse Hierarchical Dictionary Learning , 2010, ICML.

[26]  Marc Teboulle,et al.  Proximal alternating linearized minimization for nonconvex and nonsmooth problems , 2013, Mathematical Programming.

[27]  Changsheng Xu,et al.  Low-Rank Sparse Coding for Image Classification , 2013, 2013 IEEE International Conference on Computer Vision.

[28]  Lu Yang,et al.  Sparse representation and learning in visual recognition: Theory and applications , 2013, Signal Process..

[29]  René Vidal,et al.  Sparse Manifold Clustering and Embedding , 2011, NIPS.

[30]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[31]  Wei-Ying Ma,et al.  Locality preserving clustering for image database , 2004, MULTIMEDIA '04.

[32]  Javier Portilla,et al.  L0-Norm-Based Sparse Representation Through Alternate Projections , 2006, 2006 International Conference on Image Processing.

[33]  Proximal Methods for Sparse Hierarchical Dictionary Learning: Supplementary Materials , 2010 .

[34]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[36]  Yong Yu,et al.  Robust Recovery of Subspace Structures by Low-Rank Representation , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Richard I. Hartley,et al.  Graph connectivity in sparse subspace clustering , 2011, CVPR 2011.