A Clustering Approach to Learning Sparsely Used Overcomplete Dictionaries

We consider the problem of learning overcomplete dictionaries in the context of sparse coding, where each sample selects a sparse subset of dictionary elements. Our main result is a strategy to approximately recover the unknown dictionary using an efficient algorithm. Our algorithm is a clustering-style procedure, where each cluster is used to estimate a dictionary element. The resulting solution can often be further cleaned up to obtain a high accuracy estimate, and we provide one simple scenario where $\ell_1$-regularized regression can be used for such a second stage.
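
As a rough illustration of the clustering idea, the sketch below groups samples whose pairwise correlations are large and then estimates one dictionary element per group. It is only a minimal sketch under stated assumptions: the function names, the threshold `tau`, the greedy grouping rule, and the use of the top singular vector as the per-cluster estimate are all illustrative choices, not details given in the abstract. The $\ell_1$-regularized second stage is not shown.

```python
import numpy as np

def correlation_clusters(Y, tau):
    """Greedily group samples (columns of Y) whose absolute inner
    products with a seed sample exceed tau. Hypothetical helper."""
    n = Y.shape[1]
    G = np.abs(Y.T @ Y) > tau          # adjacency matrix of a correlation graph
    np.fill_diagonal(G, False)         # ignore self-correlations
    unassigned = set(range(n))
    clusters = []
    for i in range(n):
        if i not in unassigned:
            continue
        # the seed i plus every still-unassigned sample adjacent to it
        members = [i] + [int(j) for j in np.flatnonzero(G[i]) if int(j) in unassigned]
        if len(members) > 1:           # keep only groups with at least two samples
            clusters.append(members)
            unassigned -= set(members)
    return clusters

def estimate_element(Y, members):
    """Estimate one dictionary element (up to sign) as the top left
    singular vector of the samples in a cluster. Assumed estimator."""
    U, _, _ = np.linalg.svd(Y[:, members], full_matrices=False)
    return U[:, 0]

# Toy data: four samples, each a scaled copy of one of two unit directions.
Y = np.array([[2.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 3.0, 1.0],
              [0.0,  0.0, 0.0, 0.0]])
clusters = correlation_clusters(Y, tau=0.5)
estimates = [estimate_element(Y, c) for c in clusters]
```

In this toy case every sample uses a single element, so the groups separate cleanly. With genuinely sparse combinations a sample can correlate with several groups, and a more careful clustering rule would be needed.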
