LLC Revisit: Scene Classification with k-Farthest Neighbours

SUMMARY This paper introduces a simple but e ff ective way to boost the performance of scene classification through a novel approach to the LLC coding process. In our proposed method, a local descriptor is encoded not only with k -nearest visual words but also with k -farthest visual words to produce more discriminative code. Since the proposed method is a simple modification of the image classification model, it can be easily integrated into various existing BoF models proposed in various areas, such as coding, pooling, to boost their scene classification performance. The results of experiments conducted with three scene datasets: 15-Scenes, MIT-Indoor67, and Sun367 show that adding k -farthest visual words better enhances scene classification performance than increasing the number of k -nearest visual words.

[1]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[2]  Thomas Mensink,et al.  Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.

[3]  C. V. Jawahar,et al.  Blocks That Shout: Distinctive Parts for Scene Classification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[5]  Fereshteh Sadeghi,et al.  Latent Pyramidal Regions for Recognizing Scenes , 2012, ECCV.

[6]  Fei-Fei Li,et al.  Object-Centric Spatial Pooling for Image Classification , 2012, ECCV.

[7]  Lei Wang,et al.  In defense of soft-assignment coding , 2011, 2011 International Conference on Computer Vision.

[8]  Nicolas Le Roux,et al.  Ask the locals: Multi-way local pooling for image recognition , 2011, 2011 International Conference on Computer Vision.

[9]  Svetlana Lazebnik,et al.  Scene recognition and weakly supervised object localization with deformable part-based models , 2011, 2011 International Conference on Computer Vision.

[10]  Yasuo Kuniyoshi,et al.  Discriminative spatial pyramid , 2011, CVPR 2011.

[11]  Hao Su,et al.  Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[12]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[13]  Shin'ichi Satoh,et al.  Building Compact Local Pairwise Codebook with Joint Feature Space Clustering , 2010, ECCV.

[14]  Cor J. Veenman,et al.  Visual Word Ambiguity , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[16]  Krista A. Ehinger,et al.  SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Liang-Tien Chia,et al.  Local features are not lonely – Laplacian sparse coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Jean Ponce,et al.  Learning mid-level features for recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[19]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[21]  Yihong Gong,et al.  Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.

[22]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Rong Jin,et al.  Unifying discriminative visual codebook generation with classifier training for object category recognition , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[27]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[28]  Gabriela Csurka,et al.  Adapted Vocabularies for Generic Visual Categorization , 2006, ECCV.

[29]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[30]  Thomas Serre,et al.  Object recognition with features inspired by visual cortex , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[31]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[32]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[33]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.