Dictionary Learning for Fast Classification Based on Soft-thresholding

Classifiers based on sparse representations have recently been shown to provide excellent results in many visual recognition and classification tasks. However, the high cost of computing sparse representations at test time is a major obstacle that limits the applicability of these methods in large-scale problems, or in scenarios where computational power is restricted. We consider in this paper a simple yet efficient alternative to sparse coding for feature extraction. We study a classification scheme that applies the soft-thresholding nonlinear mapping in a dictionary, followed by a linear classifier. A novel supervised dictionary learning algorithm tailored for this low complexity classification architecture is proposed. The dictionary learning problem, which jointly learns the dictionary and linear classifier, is cast as a difference of convex (DC) program and solved efficiently with an iterative DC solver. We conduct experiments on several datasets, and show that our learning algorithm that leverages the structure of the classification problem outperforms generic learning procedures. Our simple classifier based on soft-thresholding also competes with the recent sparse coding classifiers, when the dictionary is learned appropriately. The adopted classification scheme further requires less computational time at the testing stage, compared to other classifiers. The proposed scheme shows the potential of the adequately trained soft-thresholding mapping for classification and paves the way towards the development of very efficient classification methods for vision problems.

[1]  PerronninFlorent,et al.  Good Practice in Large-Scale Learning for Image Classification , 2014 .

[2]  Y-Lan Boureau,et al.  Learning Convolutional Feature Hierarchies for Visual Recognition , 2010, NIPS.

[3]  Rajat Raina,et al.  Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.

[4]  Chunheng Wang,et al.  Sparse representation for face recognition based on discriminative low-rank dictionary learning , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Erkki Oja,et al.  Reduced Multidimensional Co-Occurrence Histograms in Texture Classification , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Larry S. Davis,et al.  Learning Structured Low-Rank Representations for Image Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Geoffrey E. Hinton,et al.  On rectified linear units for speech processing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[9]  David J. Kriegman,et al.  From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Andrew Y. Ng,et al.  The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization , 2011, ICML.

[11]  Guillermo Sapiro,et al.  Classification and clustering via dictionary learning with structured incoherence and shared features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[12]  Julien Mairal,et al.  Complexity Analysis of the Lasso Regularization Path , 2012, ICML.

[13]  FrossardPascal,et al.  Dictionary Learning for Fast Classification Based on Soft-thresholding , 2015 .

[14]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[15]  Guillermo Sapiro,et al.  Supervised Dictionary Learning , 2008, NIPS.

[16]  Mohamed-Jalal Fadili,et al.  Inpainting and Zooming Using Sparse Representations , 2009, Comput. J..

[17]  Leo Liberti,et al.  Introduction to Global Optimization , 2006 .

[18]  Pascal Frossard,et al.  Low-rate and flexible image coding with redundant representations , 2006, IEEE Transactions on Image Processing.

[19]  Pascal Vincent,et al.  Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Marc Teboulle,et al.  A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems , 2009, SIAM J. Imaging Sci..

[21]  Jonathan J. Hull,et al.  A Database for Handwritten Text Recognition Research , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[22]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[23]  Jean Ponce,et al.  Task-Driven Dictionary Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Yann LeCun,et al.  What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[25]  Le Thi Hoai An,et al.  The DC (Difference of Convex Functions) Programming and DCA Revisited with DC Models of Real World Nonconvex Optimization Problems , 2005, Ann. Oper. Res..

[26]  Marc'Aurelio Ranzato,et al.  Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition , 2010, ArXiv.

[27]  Yu-Chiang Frank Wang,et al.  Low-rank matrix recovery with structural incoherence for robust face recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Le Thi Hoai An,et al.  A D.C. Optimization Algorithm for Solving the Trust-Region Subproblem , 1998, SIAM J. Optim..

[29]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[30]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  David Mumford,et al.  Communications on Pure and Applied Mathematics , 1989 .

[32]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2003, ICTAI.

[33]  Guillermo Sapiro,et al.  Online Learning for Matrix Factorization and Sparse Coding , 2009, J. Mach. Learn. Res..

[34]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[35]  Ke Huang,et al.  Sparse Representation for Signal Classification , 2006, NIPS.

[36]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[37]  Gert R. G. Lanckriet,et al.  Sparse eigen methods by D.C. programming , 2007, ICML '07.

[38]  W. Gander,et al.  A D.C. OPTIMIZATION ALGORITHM FOR SOLVING THE TRUST-REGION SUBPROBLEM∗ , 1998 .

[39]  Michael Elad,et al.  Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries , 2006, IEEE Transactions on Image Processing.

[40]  Razvan Pascanu,et al.  Learning Algorithms for the Classification Restricted Boltzmann Machine , 2012, J. Mach. Learn. Res..

[41]  Yann LeCun,et al.  Learning Fast Approximations of Sparse Coding , 2010, ICML.

[42]  Alan L. Yuille,et al.  The Concave-Convex Procedure (CCCP) , 2001, NIPS.

[43]  Misha Denil,et al.  Recklessly Approximate Sparse Coding , 2012, ArXiv.

[44]  Christopher M. Bishop,et al.  Neural networks for pattern recognition , 1995 .

[45]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[46]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[47]  René Vidal,et al.  Sparse Subspace Clustering: Algorithm, Theory, and Applications , 2012, IEEE transactions on pattern analysis and machine intelligence.

[48]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[49]  Andrew L. Maas Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .

[50]  I. Daubechies,et al.  An iterative thresholding algorithm for linear inverse problems with a sparsity constraint , 2003, math/0307152.