Ask the dictionary: Soft-assignment location-orientation pooling for image classification

Pooling is one of the key components of the well-known bag-of-visual-words (BoW) model, which is widely used in image classification. In this paper, we propose a novel pooling method called Soft-Assignment Location-Orientation Pooling (SALOP). Inspired by Bag Of Statistical Sampling Analysis (BOSSA), SALOP also uses the dictionary to guide pooling, but it leverages both the location and the orientation of the local descriptors relative to the dictionary atoms when aggregating feature codes. Moreover, unlike existing pooling methods, SALOP employs a soft-assignment scheme to handle the ambiguity and uncertainty that arise in the pooling process. We evaluate the method on two image benchmarks, Scene15 and PASCAL VOC 2007. The experimental results show that SALOP achieves promising performance.
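The abstract does not give the pooling formula, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that "location" means the Euclidean distance between a descriptor and an atom, that "orientation" means the angle between them, and that soft assignment spreads each descriptor over distance and angle bins with a normalized Gaussian kernel. The function names, bin counts, and bandwidth choice are all hypothetical.

```python
import numpy as np


def soft_bin_weights(values, centers, bandwidth):
    """Spread each value over all bin centers with a normalized Gaussian kernel."""
    diff = values[:, None] - centers[None, :]
    weights = np.exp(-0.5 * (diff / bandwidth) ** 2)
    return weights / weights.sum(axis=1, keepdims=True)


def salop_like_pooling(descriptors, dictionary, n_dist_bins=4, n_angle_bins=4):
    """Hypothetical location-orientation pooling with soft bin assignment.

    descriptors: (N, D) local descriptors of one image.
    dictionary:  (K, D) atoms (visual words).
    Returns a vector of length K * n_dist_bins * n_angle_bins.
    """
    X = np.asarray(descriptors, dtype=float)
    A = np.asarray(dictionary, dtype=float)
    n_desc, n_atoms = X.shape[0], A.shape[0]

    # Assumed "location" cue: Euclidean distance from each descriptor to each atom.
    dist = np.linalg.norm(X[:, None, :] - A[None, :, :], axis=2)  # (N, K)

    # Assumed "orientation" cue: angle between descriptor and atom, in [0, pi].
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    An = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    angle = np.arccos(np.clip(Xn @ An.T, -1.0, 1.0))  # (N, K)

    # Evenly spaced bin centers; the kernel bandwidth equals the bin spacing.
    dist_centers = np.linspace(0.0, max(dist.max(), 1e-12), n_dist_bins)
    angle_centers = np.linspace(0.0, np.pi, n_angle_bins)
    dist_bw = dist_centers[1] - dist_centers[0]
    angle_bw = angle_centers[1] - angle_centers[0]

    pooled = np.empty((n_atoms, n_dist_bins, n_angle_bins))
    for k in range(n_atoms):
        w_dist = soft_bin_weights(dist[:, k], dist_centers, dist_bw)      # (N, Bd)
        w_angle = soft_bin_weights(angle[:, k], angle_centers, angle_bw)  # (N, Ba)
        # Joint soft histogram for atom k, averaged over the descriptors.
        pooled[k] = w_dist.T @ w_angle / n_desc
    return pooled.reshape(-1)


# Usage with random stand-ins for SIFT-like descriptors and a learned dictionary.
rng = np.random.default_rng(0)
X = rng.random((50, 8))
A = rng.random((5, 8))
feature = salop_like_pooling(X, A)
```

Because each descriptor contributes a unit of mass per atom, every per-atom histogram in this sketch sums to one. A hard-assignment variant would instead place each descriptor in a single bin, which is the ambiguity that soft assignment is meant to smooth out.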
