Codebook optimization using word activation forces for scene categorization

Visual-codebook-based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for texture analysis and natural scene classification. Codebooks are typically created from a set of training images using a clustering algorithm, but they are often functionally limited by redundancy. In this paper, we optimize the codebook using the recently proposed statistics of word activation forces (WAFs), and show that WAFs can remove this redundancy efficiently. In experiments, the proposed method achieved state-of-the-art performance on the Caltech-101, fifteen natural scene categories, and VOC2007 databases. The optimization method also offers insights into the success of several recently proposed image classification approaches, including vector quantization (VQ) coding in Spatial Pyramid Matching (SPM), sparse coding SPM (ScSPM), and Locality-constrained Linear Coding (LLC).
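
As an illustration of the baseline pipeline the abstract refers to (a codebook learned by clustering, followed by hard VQ coding), the following is a minimal sketch and not the authors' implementation. It assumes plain k-means on already-extracted local descriptors; the function names `build_codebook`, `assign`, and `vq_histogram`, and all parameter values, are hypothetical. The WAF-based redundancy removal is not sketched, since the abstract does not specify how it is computed.

```python
import numpy as np


def assign(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Learn k codewords with plain k-means (an assumed clustering choice)."""
    rng = np.random.default_rng(seed)
    # Initialise the centers from k distinct training descriptors.
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].astype(float)
    for _ in range(n_iter):
        labels = assign(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def vq_histogram(descriptors, codebook):
    """Hard-assignment (VQ) histogram of codeword counts, L1-normalised."""
    labels = assign(descriptors, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    # Synthetic stand-in for local descriptors (e.g. 128-D SIFT in practice).
    rng = np.random.default_rng(1)
    train = rng.normal(size=(500, 16))
    codebook = build_codebook(train, k=8)
    image_descriptors = rng.normal(size=(60, 16))
    print(vq_histogram(image_descriptors, codebook))
```

Redundant codewords in such a codebook would show up as near-duplicate centers that split the counts of one underlying visual pattern across several histogram bins, which is the kind of limitation the proposed optimization targets.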
