Clustering Consistent Sparse Subspace Clustering

Subspace clustering is the problem of clustering data points into a union of low-dimensional linear or affine subspaces. It is the mathematical abstraction of many important problems in computer vision and image processing, and it has recently drawn considerable attention in machine learning and statistics. In particular, a line of recent work (Elhamifar and Vidal, 2013; Soltanolkotabi et al., 2012; Wang and Xu, 2013; Soltanolkotabi et al., 2014) provided strong theoretical guarantees for the seminal Sparse Subspace Clustering (SSC) algorithm (Elhamifar and Vidal, 2013) under various settings and, to some extent, justified its state-of-the-art performance in applications such as motion segmentation and face clustering. These works have focused on obtaining milder conditions under which SSC obeys the "self-expressiveness property", which ensures that no two points from different subspaces can be clustered together. Such a guarantee, however, is not sufficient for the clustering to be correct, because of the notorious "graph connectivity problem" (Nasihatkon and Hartley, 2011): points from the same subspace may still be split into several disconnected groups. In this paper, we show that this issue can be resolved by a very simple post-processing procedure under only a mild "general position" assumption. In addition, we show that the approach is robust to arbitrary bounded perturbations of the data whenever the "general position" assumption holds with a margin. These results provide the first exact clustering guarantee for SSC for subspaces of dimension greater than 3.
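The gap between the self-expressiveness property and a correct clustering can be illustrated with a small sketch. It is not the post-processing procedure of the paper, which the abstract does not spell out. It only assumes a coefficient matrix `C` such as SSC would produce, together with ground-truth subspace labels; the helper names `satisfies_sep` and `count_components` and the tolerance `tol` are hypothetical choices made for this example.

```python
import numpy as np

def satisfies_sep(C, labels, tol=1e-8):
    """Return True if no coefficient links points from different subspaces."""
    C = np.asarray(C, dtype=float)
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]  # pairs from different subspaces
    return not np.any(np.abs(C[different]) > tol)

def count_components(C, tol=1e-8):
    """Count connected components of the undirected graph |C| + |C|^T."""
    C = np.asarray(C, dtype=float)
    A = (np.abs(C) + np.abs(C).T) > tol  # symmetric boolean adjacency
    n = A.shape[0]
    seen = np.zeros(n, dtype=bool)
    components = 0
    for start in range(n):
        if seen[start]:
            continue
        components += 1
        seen[start] = True
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in np.flatnonzero(A[i] & ~seen):
                seen[j] = True
                stack.append(int(j))
    return components

# Toy example (assumed data): points 0-3 lie in one subspace, points 4-5 in another.
labels = np.array([0, 0, 0, 0, 1, 1])
C = np.zeros((6, 6))
C[0, 1] = C[1, 0] = 1.0  # points 0 and 1 represent each other
C[2, 3] = C[3, 2] = 1.0  # points 2 and 3 represent each other
C[4, 5] = C[5, 4] = 1.0  # points 4 and 5 represent each other

sep_ok = satisfies_sep(C, labels)
n_comp = count_components(C)
print(sep_ok, n_comp)
```

In this toy matrix no coefficient crosses subspaces, so the self-expressiveness property holds, yet the first subspace falls apart into two components. The graph therefore has more components than there are subspaces, which is the situation the graph connectivity problem describes.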

[1]  S. Shankar Sastry,et al.  Generalized principal component analysis (GPCA) , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  René Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications , 2012, IEEE transactions on pattern analysis and machine intelligence.

[3]  Ryan J. Tibshirani,et al.  Degrees of freedom and model search , 2014, 1402.1920.

[4]  Giovanni Montana,et al.  Subspace clustering of high-dimensional data: a predictive approach , 2012, Data Mining and Knowledge Discovery.

[5]  Richard I. Hartley,et al.  Graph connectivity in sparse subspace clustering , 2011, CVPR 2011.

[6]  S. Geer,et al.  Oracle Inequalities and Optimal Inference under Group Sparsity , 2010, 1007.1771.

[7]  Kun Huang,et al.  Multiscale Hybrid Linear Models for Lossy Image Representation , 2006, IEEE Transactions on Image Processing.

[8]  Yong Yu,et al.  Robust Subspace Segmentation by Low-Rank Representation , 2010, ICML.

[9]  René Vidal,et al.  Motion Segmentation with Missing Data Using PowerFactorization and GPCA , 2004, CVPR.

[10]  P. Bickel,et al.  SIMULTANEOUS ANALYSIS OF LASSO AND DANTZIG SELECTOR , 2008, 0801.1095.

[11]  Yin Chen,et al.  Fused sparsity and robust estimation for linear models with unknown variance , 2012, NIPS.

[12]  David J. Kriegman,et al.  Clustering appearances of objects under varying illumination conditions , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Emmanuel J. Candès,et al.  Robust Subspace Clustering , 2013, ArXiv.

[14]  Emmanuel J. Candès,et al.  A Geometric Analysis of Subspace Clustering with Outliers , 2011, ArXiv.

[15]  Yudong Chen,et al.  Clustering Partially Observed Graphs via Convex Optimization , 2011, ICML.

[16]  Marc Pollefeys,et al.  A General Framework for Motion Segmentation: Independent, Articulated, Rigid, Non-rigid, Degenerate and Non-degenerate , 2006, ECCV.

[17]  Michael I. Jordan,et al.  On Spectral Clustering: Analysis and an algorithm , 2001, NIPS.

[18]  Helmut Bölcskei,et al.  Robust Subspace Clustering via Thresholding , 2013, IEEE Transactions on Information Theory.

[19]  Stephen P. Boyd,et al.  Enhancing Sparsity by Reweighted ℓ1 Minimization , 2007, 0711.1612.

[20]  Stratis Ioannidis,et al.  Guess Who Rated This Movie: Identifying Users Through Subspace Clustering , 2012, UAI.

[21]  Helmut Bölcskei,et al.  Subspace clustering via thresholding and spectral clustering , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[22]  Huan Xu,et al.  Noisy Sparse Subspace Clustering , 2013, J. Mach. Learn. Res.

[23]  Paul S. Bradley,et al.  k-Plane Clustering , 2000, J. Glob. Optim.

[24]  Guangliang Chen,et al.  Spectral Curvature Clustering (SCC) , 2009, International Journal of Computer Vision.

[25]  P. Tseng Nearest q-Flat to m Points , 2000 .

[26]  V. N. Bogaevski,et al.  Matrix Perturbation Theory , 1991 .

[27]  Samir Khuller,et al.  Graph Connectivity , 2016, Encyclopedia of Algorithms.

[28]  Dimitrios Gunopulos,et al.  Automatic subspace clustering of high dimensional data for data mining applications , 1998, SIGMOD '98.

[29]  Robert D. Nowak,et al.  High-Rank Matrix Completion , 2012, AISTATS.

[30]  Constantine Caramanis,et al.  Greedy Subspace Clustering , 2014, NIPS.

[31]  Martin J. Wainwright,et al.  Restricted Eigenvalue Properties for Correlated Gaussian Designs , 2010, J. Mach. Learn. Res.

[32]  Hans-Peter Kriegel,et al.  Subspace clustering , 2012, WIREs Data Mining Knowl. Discov.