Paper information - Blockwise coordinate descent schemes for efficient and effective dictionary learning

Sparse-representation-based dictionary learning, usually viewed as a method for rearranging the structure of the original data so that its energy is compact over a non-orthogonal, over-complete dictionary, is widely used in signal processing, pattern recognition, machine learning, statistics, and neuroscience. The current sparse representation framework decouples the optimization problem into two subproblems, alternating between sparse coding and dictionary learning with different optimizers, and so treats the elements of the dictionary and of the codes separately. In this paper, we treat the elements of both the dictionary and the codes homogeneously. The original optimization is decoupled directly into several blockwise alternating subproblems rather than the two above, so sparse coding and dictionary learning are unified. More precisely, the variables involved in the optimization problem are partitioned into several suitable blocks with convexity preserved, making it possible to perform an exact blockwise coordinate descent. For each separable subproblem, a closed-form solution is obtained from the convexity and monotonicity of the parabolic function. The algorithm is thus simple, efficient, and effective. Experimental results show that our algorithm significantly accelerates the learning process. An application to image classification further demonstrates the efficiency of the proposed optimization strategy.

Highlights

- A novel, simple, and efficient algorithm, BCDDL, is proposed to solve SC-DL problems.
- BCDDL is the fastest algorithm in solving SC-DL to date.
- BCDDL shows state-of-the-art performance when learning bases with small samples.
- BCDDL shows state-of-the-art performance when pursuing comparatively much sparser codes.
- BCDDL achieves superior performance in image classification tasks.
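The abstract's idea of partitioning the variables into blocks, each with a closed-form update, can be illustrated with a minimal sketch. It is not the authors' BCDDL implementation. It assumes an l1-regularized objective 0.5*||X - B S||_F^2 + alpha*||S||_1 with dictionary columns constrained to the unit ball, and it takes each row of the code matrix S and each column of the dictionary B as one block. The function names, the symbols X, B, S, and alpha, and the update order are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(a, t):
    """Elementwise soft-thresholding, the closed-form minimizer of the l1 subproblem."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)


def objective(X, B, S, alpha):
    """Assumed objective: 0.5 * ||X - B S||_F^2 + alpha * ||S||_1."""
    return 0.5 * np.sum((X - B @ S) ** 2) + alpha * np.sum(np.abs(S))


def bcd_dictionary_learning(X, n_atoms, alpha=0.1, n_iter=20, seed=0):
    """Hypothetical blockwise coordinate descent sketch.

    X is (d, n); returns B (d, n_atoms), S (n_atoms, n), and the objective history.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    B = rng.standard_normal((d, n_atoms))
    B /= np.linalg.norm(B, axis=0, keepdims=True)  # start with unit-norm atoms
    S = np.zeros((n_atoms, n))
    history = [objective(X, B, S, alpha)]

    for _ in range(n_iter):
        R = X - B @ S  # running residual, updated incrementally

        # Block type 1: one row of S at a time (codes of all samples on atom k).
        for k in range(n_atoms):
            b = B[:, k]
            R += np.outer(b, S[k])  # residual with atom k removed
            c = b @ b
            # Minimizer of a 1-D parabola plus an l1 term, per sample.
            S[k] = soft_threshold(b @ R, alpha) / c if c > 0 else 0.0
            R -= np.outer(b, S[k])

        # Block type 2: one column of B at a time.
        for k in range(n_atoms):
            s = S[k]
            c = s @ s
            if c == 0:
                continue  # atom unused in this pass; leave it unchanged
            R += np.outer(B[:, k], s)
            b = R @ s / c  # unconstrained least-squares minimizer
            B[:, k] = b / max(1.0, np.linalg.norm(b))  # project onto the unit ball
            R -= np.outer(B[:, k], s)

        history.append(objective(X, B, S, alpha))

    return B, S, history
```

Because every block update in this sketch is an exact minimizer of its subproblem, the recorded objective should not increase from one pass to the next, which is an easy sanity check when experimenting with alpha or the number of atoms.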
