Self-explanatory Sparse Representation for Image Classification

Traditional sparse representation algorithms usually operate in a single Euclidean space. This paper uses a self-explanatory reformulation of sparse representation, i.e., one that explicitly links the learned dictionary atoms to the original feature spaces, to extend simultaneous dictionary learning and sparse coding to reproducing kernel Hilbert spaces (RKHS). The resulting single-view self-explanatory sparse representation (SSSR) applies to an arbitrary kernel space and has the useful property that the derivatives with respect to the coding parameters are independent of the chosen kernel. Building on SSSR, multiple-view self-explanatory sparse representation (MSSR) is proposed to capture and combine salient regions and structures from different kernel spaces. This is equivalent to learning a nonlinear structured dictionary, whose complexity is reduced by learning a set of smaller dictionary blocks via SSSR. SSSR and MSSR are then incorporated into a spatial pyramid matching framework and applied to image classification. Extensive experiments on four benchmark datasets, namely UIUC-Sports, Scene 15, Caltech-101, and Caltech-256, demonstrate the effectiveness of the proposed approach.
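The self-explanatory idea can be illustrated with a minimal sketch. It assumes the dictionary is written as a linear combination of the mapped training samples, D = phi(X) W, so that the reconstruction error ||phi(X) - phi(X) W S||_F^2 depends on the data only through the kernel matrix K. The names W, S, rbf_kernel, and sparse_code_ista are hypothetical, the RBF kernel is just one possible choice, and ISTA is swapped in here as a generic solver; none of this is taken from the paper itself. The point of the sketch is that the gradient with respect to the codes S has the same form whatever kernel produced K.

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    # X: (n_samples, n_features). Returns the (n, n) Gram matrix.
    # The RBF kernel is an illustrative choice; any PSD kernel would do.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def objective(K, W, S, lam):
    # ||phi(X) - phi(X) W S||_F^2 + lam * ||S||_1, expressed through K only.
    R = np.eye(K.shape[0]) - W @ S
    return np.trace(R.T @ K @ R) + lam * np.abs(S).sum()

def sparse_code_ista(K, W, lam=0.1, n_iter=200):
    # Sparse coding with W fixed (assumed form D = phi(X) W).
    # W: (n, k) atom weights; returns codes S of shape (k, n).
    G = W.T @ K @ W   # Gram matrix of the atoms in feature space
    B = W.T @ K       # inner products between atoms and samples
    # Lipschitz constant of the gradient of the smooth term.
    L = 2.0 * np.linalg.eigvalsh(G).max() + 1e-12
    S = np.zeros_like(B)
    for _ in range(n_iter):
        grad = 2.0 * (G @ S - B)   # same expression for every kernel
        Z = S - grad / L
        # Soft-thresholding: proximal step for the l1 penalty.
        S = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)
    return S

# Usage on random data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
K = rbf_kernel(X)
W = rng.normal(size=(30, 8))
W /= np.sqrt(np.diag(W.T @ K @ W))   # unit-norm atoms in feature space
S = sparse_code_ista(K, W, lam=0.1)
```

Swapping rbf_kernel for another kernel changes only K; the update for S is untouched. A multiple-view variant in the spirit of MSSR would, under the same assumptions, repeat this for several kernel matrices and combine the resulting codes.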
