Supervised translation-invariant sparse coding

In this paper, we propose a novel supervised hierarchical sparse coding model based on local image descriptors for classification tasks. The supervised dictionary training is performed via back-projection, by minimizing the training error of classifying the image level features, which are extracted by max pooling over the sparse codes within a spatial pyramid. Such a max pooling procedure across multiple spatial scales offer the model translation invariant properties, similar to the Convolutional Neural Network (CNN). Experiments show that our supervised dictionary improves the performance of the proposed model significantly over the unsupervised dictionary, leading to state-of-the-art performance on diverse image databases. Further more, our supervised model targets learning linear features, implying its great potential in handling large scale datasets in real applications.

[1]  David J. Kriegman,et al.  Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection , 1996, ECCV.

[2]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[3]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[4]  Terence Sim,et al.  The CMU Pose, Illumination, and Expression (PIE) database , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[5]  David J. Kriegman,et al.  Acquiring linear subspaces for face recognition under variable lighting , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  A. Bruckstein,et al.  K-SVD : An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .

[7]  D. Donoho For most large underdetermined systems of linear equations the minimal 𝓁1‐norm solution is also the sparsest solution , 2006 .

[8]  M. Elad,et al.  $rm K$-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.

[9]  Patrick J. Flynn,et al.  Preliminary Face Recognition Grand Challenge Results , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[10]  Rajat Raina,et al.  Efficient sparse coding algorithms , 2006, NIPS.

[11]  Jiawei Han,et al.  Semi-supervised Discriminant Analysis , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12]  Rajat Raina,et al.  Self-taught learning: transfer learning from unlabeled data , 2007, ICML '07.

[13]  Yuxiao Hu,et al.  Learning a Spatially Smooth Subspace for Face Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Yihong Gong,et al.  Deep Learning with Kernel Regularization for Visual Recognition , 2008, NIPS.

[15]  Thomas S. Huang,et al.  Image super-resolution as sparse representation of raw image patches , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  J. Andrew Bagnell,et al.  Differential Sparse Coding , 2008 .

[17]  Michael Elad,et al.  Sparse Representation for Color Image Restoration , 2008, IEEE Transactions on Image Processing.

[18]  Guillermo Sapiro,et al.  Supervised Dictionary Learning , 2008, NIPS.

[19]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[20]  Honglak Lee,et al.  Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[21]  Yann LeCun,et al.  What is the best multi-stage architecture for object recognition? , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  R. Fergus,et al.  Learning invariant features through topographic filter maps , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Yihong Gong,et al.  Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.

[24]  Zihan Zhou,et al.  Towards a practical face recognition system: Robust registration and illumination by sparse representation , 2009, CVPR.

[25]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[27]  Trevor Hastie,et al.  Regularization Paths for Generalized Linear Models via Coordinate Descent. , 2010, Journal of statistical software.