Paper information - Self-Expressive Decompositions for Matrix Approximation and Clustering

Self-Expressive Decompositions for Matrix Approximation and Clustering

Data-aware methods for dimensionality reduction and matrix decomposition aim to find low-dimensional structure in a collection of data. Classical approaches discover such structure by learning a basis that can efficiently express the collection. Recently, "self-expression", the idea of using a small subset of data vectors to represent the full collection, has been developed as an alternative to learning. Here, we introduce a scalable method for computing sparse SElf-Expressive Decompositions (SEED). SEED is a greedy method that constructs a basis by sequentially selecting incoherent vectors from the dataset. After forming a basis from a subset of vectors in the dataset, SEED then computes a sparse representation of the dataset with respect to this basis. We develop sufficient conditions under which SEED exactly represents low-rank matrices and vectors sampled from a union of independent subspaces. We show how SEED can be used in applications ranging from matrix approximation and denoising to clustering, and apply it to numerous real-world datasets. Our results demonstrate that SEED is an attractive low-complexity alternative to other sparse matrix factorization approaches such as sparse PCA and self-expressive methods for clustering.

Richard G. Baraniuk | Tom Goldstein | Konrad P. Körding | Eva L. Dyer | Raajen Patel
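
The two stages named in the abstract (greedily pick a subset of data vectors as a basis, then sparsely represent every data vector in that basis) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`select_columns`, `omp`, `seed_sketch`) and the parameters `k`, `s`, and `tol` are hypothetical, the selection rule is a simple residual-norm pivot used as a stand-in for the paper's incoherence-based selection, and the sparse coding step is a plain orthogonal matching pursuit assumed for illustration.

```python
import numpy as np


def select_columns(X, k, tol=1e-10):
    """Greedily pick up to k columns of X (stand-in selection rule).

    At each step, choose the column with the largest residual norm after
    projecting out the span of the columns chosen so far.
    """
    R = np.asarray(X, dtype=float).copy()
    scale = np.linalg.norm(R, axis=0).max()
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))
        if norms[j] <= tol * scale:
            break  # remaining columns already lie in the chosen span
        chosen.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)  # remove the new direction from all columns
    return chosen


def omp(D, x, s, tol=1e-10):
    """Orthogonal matching pursuit: approximate x with at most s columns of D."""
    x = np.asarray(x, dtype=float)
    col_norms = np.linalg.norm(D, axis=0)
    coef = np.zeros(D.shape[1])
    support, sol = [], None
    residual = x.copy()
    for _ in range(min(s, D.shape[1])):
        if np.linalg.norm(residual) <= tol * max(np.linalg.norm(x), 1.0):
            break
        corr = np.abs(D.T @ residual) / col_norms  # normalized correlations
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def seed_sketch(X, k, s):
    """Return selected column indices and sparse coefficients V with X ~ X[:, idx] @ V."""
    idx = select_columns(X, k)
    D = X[:, idx]
    V = np.column_stack([omp(D, X[:, i], s) for i in range(X.shape[1])])
    return idx, V
```

For an exactly low-rank matrix, calling `seed_sketch(X, k, s)` with `k` and `s` at least the rank should reproduce `X` as `X[:, idx] @ V` up to floating-point error, which mirrors the exact-representation setting the abstract mentions without reproducing the paper's conditions.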

[1] David J. Kriegman, et al. From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose, 2001, IEEE Trans. Pattern Anal. Mach. Intell.

[2] Zhifeng Zhang, et al. Adaptive time-frequency decompositions, 1994.

[3] Petros Drineas, et al. On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning, 2005, J. Mach. Learn. Res.

[4] Yurii Nesterov, et al. Generalized Power Method for Sparse Principal Component Analysis, 2008, J. Mach. Learn. Res.

[5] Guillermo Sapiro, et al. Sparse Representation for Computer Vision and Pattern Recognition, 2010, Proceedings of the IEEE.

[6] René Vidal, et al. Sparse Subspace Clustering: Algorithm, Theory, and Applications, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7] Santosh S. Vempala, et al. Matrix approximation and projective clustering via volume sampling, 2006, SODA '06.

[8] R. Tibshirani, et al. Sparse Principal Component Analysis, 2006.

[9] Guillermo Sapiro, et al. See all by looking at a few: Sparse modeling for finding representative objects, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[10] Harris Drucker, et al. Comparison of learning algorithms for handwritten digit recognition, 1995.

[11] A. Bruckstein, et al. K-SVD: An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation, 2005.

[12] Emmanuel J. Candès, et al. A Geometric Analysis of Subspace Clustering with Outliers, 2011, ArXiv.

[13] M. Elad, et al. K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation, 2006, IEEE Transactions on Signal Processing.

[14] Richard G. Baraniuk, et al. oASIS: Adaptive Column Sampling for Kernel Matrix Approximation, 2015, ArXiv.

[15] Michael A. Saunders, et al. Atomic Decomposition by Basis Pursuit, 1998, SIAM J. Sci. Comput.

[16] René Vidal, et al. Sparse subspace clustering, 2009, CVPR.

[17] Huan Xu, et al. Noisy Sparse Subspace Clustering, 2013, J. Mach. Learn. Res.

[18] Aswin C. Sankaranarayanan, et al. Greedy feature selection for subspace clustering, 2013, J. Mach. Learn. Res.

[19] Matthias W. Seeger, et al. Using the Nyström Method to Speed Up Kernel Machines, 2000, NIPS.

[20] Inderjit S. Dhillon, et al. Co-clustering documents and words using bipartite spectral graph partitioning, 2001, KDD '01.

[21] Marc W Slutzky, et al. Statistical assessment of the stability of neural movement representations, 2011, Journal of Neurophysiology.

[22] Jitendra Malik, et al. Normalized cuts and image segmentation, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[23] Ahmed H. Tewfik, et al. Learning Sparse Representation Using Iterative Subspace Identification, 2010, IEEE Transactions on Signal Processing.

[24] Marian-Daniel Iordache, et al. Greedy algorithms for pure pixels identification in hyperspectral unmixing: A multiple-measurement vector viewpoint, 2013, 21st European Signal Processing Conference (EUSIPCO 2013).

[25] Richard I. Hartley, et al. Graph connectivity in sparse subspace clustering, 2011, CVPR 2011.

[26] Yong Yu, et al. Robust Subspace Segmentation by Low-Rank Representation, 2010, ICML.

[27] Guillermo Sapiro, et al. Finding Exemplars from Pairwise Dissimilarities via Simultaneous Sparse Recovery, 2012, NIPS.

[28] Richard G. Baraniuk, et al. Subspace clustering with dense representations, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[29] David J. Field, et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images, 1996, Nature.

[30] Petros Drineas, et al. CUR matrix decompositions for improved data analysis, 2009, Proceedings of the National Academy of Sciences.

[31] David P. Woodruff, et al. Fast approximation of matrix coherence and statistical leverage, 2011, ICML.

[32] Michael Elad, et al. Efficient Implementation of the K-SVD Algorithm using Batch Orthogonal Matching Pursuit, 2008.