Subspace Learning by $\ell^0$-Induced Sparsity

Subspace clustering methods partition data that lie in or close to a union of subspaces in accordance with the subspace structure. Methods with a sparsity prior, such as sparse subspace clustering (SSC) (Elhamifar and Vidal, IEEE Trans. Pattern Anal. Mach. Intell. 35(11):2765–2781, 2013), in which sparsity is induced by the $\ell^1$-norm, have been demonstrated to be effective for subspace clustering. Most such methods require certain assumptions on the subspaces, e.g. independence or disjointness. However, these assumptions are not guaranteed to hold in practice, and they limit the applicability of existing sparse subspace clustering methods. In this paper, we propose $\ell^0$-induced sparse subspace clustering ($\ell^0$-SSC). In contrast to the independence or disjointness assumptions required by most existing sparse subspace clustering methods, we prove that $\ell^0$-SSC guarantees the subspace-sparse representation, a key element in subspace clustering, for arbitrary distinct underlying subspaces almost surely under a mild i.i.d. assumption on the data generation. We also present a "no free lunch" theorem showing that obtaining the subspace-sparse representation under our general assumptions cannot be much computationally cheaper than solving the corresponding $\ell^0$ sparse representation problem of $\ell^0$-SSC. A novel approximate algorithm, named Approximate $\ell^0$-SSC (A$\ell^0$-SSC), is developed which employs proximal gradient descent to obtain a sub-optimal solution to the optimization problem of $\ell^0$-SSC with a theoretical guarantee. The sub-optimal solution is used to build a sparse similarity matrix upon which spectral clustering is performed to obtain the final clustering results. Extensive experimental results on various data sets demonstrate the superiority of A$\ell^0$-SSC over other competing clustering methods.
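The A$\ell^0$-SSC pipeline described above can be illustrated with a minimal sketch. It assumes the standard self-expressive objective $\min_C \frac{1}{2}\|X - XC\|_F^2 + \lambda\|C\|_0$ subject to $\mathrm{diag}(C) = 0$, solved by proximal gradient descent, whose proximal step for the $\ell^0$ penalty is hard thresholding; the function names and the specific step-size choice here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hard_threshold(C, t):
    """Proximal operator of t * ||C||_0: zero out entries with |c| <= sqrt(2t)."""
    out = C.copy()
    out[np.abs(out) <= np.sqrt(2.0 * t)] = 0.0
    return out

def l0_self_representation(X, lam=0.1, n_iter=300):
    """Proximal gradient descent (iterative hard thresholding) on
        min_C 0.5 * ||X - X C||_F^2 + lam * ||C||_0,  diag(C) = 0,
    where X is a (d, n) matrix with data points as columns."""
    d, n = X.shape
    # Step size 1/L, where L = sigma_max(X)^2 bounds the Lipschitz
    # constant of the gradient of the smooth quadratic term.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = X.T @ (X @ C - X)          # gradient of 0.5 * ||X - XC||_F^2
        C = hard_threshold(C - step * grad, lam * step)
        np.fill_diagonal(C, 0.0)          # forbid trivial self-representation
    return C
```

The sparse codes are then symmetrized into a similarity matrix, $W = |C| + |C|^{\top}$, on which standard spectral clustering is run to produce the final partition.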
Furthermore, we extend $\ell^0$-SSC to semi-supervised learning by performing label propagation on the sparse similarity matrix learnt by A$\ell^0$-SSC, and demonstrate the effectiveness of the resultant semi-supervised learning method, termed $\ell^0$-sparse subspace label propagation ($\ell^0$-SSLP).
