Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

Sparse Subspace Clustering (SSC) is a popular unsupervised machine learning method for clustering data lying close to an unknown union of low-dimensional linear subspaces, a problem with numerous applications in pattern recognition and computer vision. Although the behavior of SSC for complete data is by now well understood, little is known about its theoretical properties when applied to data with missing entries. In this paper, we give theoretical guarantees for SSC with incomplete data and analytically establish that projecting the zero-filled data onto the observation pattern of the point being expressed leads to a substantial improvement in performance. The main insight from our analysis is that, although the projection induces additional missing entries, this cost is counterbalanced by a structural gain: the projected and zero-filled data are in effect incomplete points associated with the union of the corresponding projected subspaces, and the point being expressed is complete with respect to that union. The significance of this phenomenon potentially extends to the entire class of self-expressive methods.
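The projection step described above can be illustrated with a minimal sketch. The abstract does not specify a solver or a regularization weight, so everything here is an assumption made for illustration: the function names (`zero_fill`, `lasso_ista`, `projected_zero_filled_coefficients`), the l1-regularized least-squares formulation of self-expression, the plain proximal-gradient (ISTA) solver, and the default `lam`. Data points are assumed to be stored as columns of `X`, with a boolean `mask` marking observed entries.

```python
import numpy as np

def zero_fill(X, mask):
    """Replace unobserved entries (mask == False) with zeros."""
    return np.where(mask, X, 0.0)

def lasso_ista(A, b, lam, n_iter=500):
    """Minimize 0.5*||A c - b||^2 + lam*||c||_1 by proximal gradient (ISTA).

    Illustrative solver only; any Lasso solver could be substituted.
    """
    c = np.zeros(A.shape[1])
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / max(np.linalg.norm(A, 2) ** 2, 1e-12)
    for _ in range(n_iter):
        z = c - step * (A.T @ (A @ c - b))                        # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return c

def projected_zero_filled_coefficients(X, mask, j, lam=0.01):
    """Express point j through the other zero-filled points, using only
    the coordinates that are observed in point j (hypothetical helper)."""
    Xz = zero_fill(X, mask)
    rows = mask[:, j]                             # observation pattern of point j
    others = np.delete(np.arange(X.shape[1]), j)  # exclude point j itself
    A = Xz[rows][:, others]                       # projected, zero-filled dictionary
    b = Xz[rows, j]                               # point j is complete on these rows
    c = np.zeros(X.shape[1])
    c[others] = lasso_ista(A, b, lam)
    return c
```

Calling `projected_zero_filled_coefficients(X, mask, j)` for every `j` would give the columns of a coefficient matrix; in standard SSC such a matrix is symmetrized into an affinity and passed to spectral clustering, a step omitted from this sketch.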
