See all by looking at a few: Sparse modeling for finding representative objects

We consider the problem of finding a few representatives for a dataset, i.e., a subset of data points that efficiently describes the entire dataset. We assume that each data point can be expressed as a linear combination of the representatives and formulate the problem of finding the representatives as a sparse multiple measurement vector problem. In our formulation, both the dictionary and the measurements are given by the data matrix, and the unknown sparse codes select the representatives via convex optimization. In general, we do not assume that the data are low-rank or distributed around cluster centers. When the data do come from a collection of low-rank models, we show that our method automatically selects a few representatives from each low-rank model. We also analyze the geometry of the representatives and discuss their relationship to the vertices of the convex hull of the data. We show that our framework can be extended to detect and reject outliers in datasets, and to efficiently deal with new observations and large datasets. The proposed framework and theoretical foundations are illustrated with examples in video summarization and image classification using representatives.
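To make the formulation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a penalized variant of the problem, minimizing lam * sum_i ||c_i||_2 + 0.5 * ||Y - Y C||_F^2, where c_i is the i-th row of the coefficient matrix C and the columns of Y are the data points. The data matrix Y serves as both dictionary and measurements, and the rows of C that stay nonzero mark the representatives. The solver (proximal gradient with row-wise soft-thresholding), the function name find_representatives, and the parameters lam, n_iter, and rel_thresh are illustrative assumptions. Any additional constraints on C or a different solver used in the paper are not reproduced here.

```python
import numpy as np


def find_representatives(Y, lam, n_iter=500, rel_thresh=0.05):
    """Pick representative columns of Y (shape d x n) via a row-sparse code C.

    Hypothetical sketch: minimizes lam * sum_i ||C[i, :]||_2
    + 0.5 * ||Y - Y @ C||_F^2 by proximal gradient descent.
    Returns (indices of selected columns, C).
    """
    n = Y.shape[1]
    G = Y.T @ Y                                  # Gram matrix, n x n
    # Lipschitz constant of the gradient of the quadratic term.
    L = max(np.linalg.norm(G, 2), 1e-12)
    C = np.zeros((n, n))
    for _ in range(n_iter):
        # Gradient step on 0.5 * ||Y - Y C||_F^2; the gradient is G C - G.
        Z = C - (G @ C - G) / L
        # Row-wise group soft-thresholding: shrinks whole rows toward zero.
        norms = np.linalg.norm(Z, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - (lam / L) / np.maximum(norms, 1e-12))
        C = scale * Z
    row_norms = np.linalg.norm(C, axis=1)
    if row_norms.max() == 0.0:
        # Penalty too strong: no column was selected.
        return np.array([], dtype=int), C
    # Rows with non-negligible norm correspond to the representatives.
    return np.flatnonzero(row_norms > rel_thresh * row_norms.max()), C


# Usage on synthetic data (assumed example, not from the paper).
rng = np.random.default_rng(0)
Y = rng.standard_normal((3, 20))
reps, C = find_representatives(Y, lam=0.1)
```

In this sketch a larger lam yields fewer representatives, and a sufficiently large lam selects none, so the parameter trades off the size of the subset against how well it reconstructs the data.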
