Local Hybrid Coding for Image Classification

Sparse coding has received considerable research attention due to its competitive performance in image classification algorithms based on spatial pyramid matching (SPM). In sparse coding, each low-level image descriptor (e.g., SIFT) is quantized into a sparse vector using an over-complete dictionary. Two typical schemes for achieving code sparsity are imposing an ℓ1-sparsity penalty on the coding coefficients and selecting a set of k-nearest-neighbor (kNN) bases from the dictionary for locality-aware encoding. In this paper, we observe that different coding schemes usually produce substantially inconsistent coefficients, each favoring either ℓ1-sparsity or bases-locality. We therefore conjecture that the two schemes should be exploited simultaneously to further enhance quantization quality. To this end, we propose a novel ensemble framework, Local Hybrid Coding (LHC), which formalizes a unified optimization problem over the different coding schemes. Specifically, we quantize each image descriptor using two disjoint subsets of the dictionary, the kNN bases and the non-kNN bases, from which we efficiently compute a hybrid representation comprising a local code and a sparse code, respectively. Extensive experiments on three benchmarks verify that LHC substantially outperforms several state-of-the-art methods on image classification tasks while bearing computational complexity comparable to that of the most efficient coding methods.
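As a rough illustration of the hybrid idea, the sketch below encodes one descriptor over two disjoint parts of a dictionary: a locality-aware code on its kNN bases and an ℓ1-penalized code on the remaining bases. It is a minimal sketch under stated assumptions, not the unified LHC optimization, whose exact objective the abstract does not give. In particular, the sequential two-stage design (the sparse code fits the residual left by the local code), the sum-to-one local solve, the ISTA solver, and all function names and parameter values (`local_code`, `sparse_code`, `hybrid_code`, `k`, `lam`, `reg`) are illustrative choices introduced here.

```python
import numpy as np


def local_code(x, B, reg=1e-4):
    """Locality-aware code of x over the columns of B (assumed stand-in).

    Solves min ||x - B w||^2 subject to sum(w) = 1 in closed form.
    """
    k = B.shape[1]
    Z = B - x[:, None]                      # shift bases so x is the origin
    C = Z.T @ Z                             # k x k local covariance
    C = C + (reg * np.trace(C) + 1e-12) * np.eye(k)  # keep C well conditioned
    w = np.linalg.solve(C, np.ones(k))
    return w / w.sum()                      # enforce the sum-to-one constraint


def sparse_code(r, B, lam=0.1, n_iter=200):
    """ISTA for min 0.5 * ||r - B c||^2 + lam * ||c||_1 (assumed solver)."""
    c = np.zeros(B.shape[1])
    L = np.linalg.norm(B, 2) ** 2           # Lipschitz constant of the gradient
    if L == 0.0:
        return c
    for _ in range(n_iter):
        z = c - (B.T @ (B @ c - r)) / L     # gradient step
        c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return c


def hybrid_code(x, D, k=5, lam=0.1):
    """Hypothetical two-stage hybrid code of x over dictionary D (d x M)."""
    dist = np.linalg.norm(D - x[:, None], axis=0)
    nn = np.argsort(dist)[:k]               # indices of the kNN bases
    rest = np.setdiff1d(np.arange(D.shape[1]), nn)  # the non-kNN bases
    code = np.zeros(D.shape[1])
    code[nn] = local_code(x, D[:, nn])
    residual = x - D[:, nn] @ code[nn]      # what the local code leaves over
    code[rest] = sparse_code(residual, D[:, rest], lam)
    return code, nn


# Toy usage: random unit-norm dictionary and one random descriptor.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)
x = rng.standard_normal(32)
code, nn = hybrid_code(x, D, k=5, lam=0.05)
```

In this sketch the coefficients on the kNN bases sum to one, while the coefficients on the non-kNN bases are driven toward zero by the ℓ1 penalty, so the two parts of the code follow the two different sparsity preferences described above.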
