Weighted Pooling of Image Code with Saliency Map for Object Recognition

Recently, codebook-based object recognition methods have achieved state-of-the-art performance on many public object databases. Building on the codebook-based approach, we propose a novel method that uses saliency information in the stage of pooling code vectors. By scaling each code response with a saliency value that represents the visual importance of the corresponding local region in an image, the proposed method effectively reduces the adverse influence of regions with low visual saliency, such as the background. Experiments on the public Flower102 database and the Caltech object database confirm that the proposed method improves on conventional codebook-based methods.
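The core idea of the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes local descriptors have already been encoded into an N-by-K matrix of code responses (e.g. sparse codes or LLC codes over a K-word codebook), and that a saliency value in [0, 1] has been sampled from a saliency map at each descriptor's location. The function name and the `mode` parameter are hypothetical.

```python
import numpy as np

def saliency_weighted_pooling(codes, saliency, mode="max"):
    """Pool per-descriptor code vectors into one image-level feature,
    scaling each code response by the saliency of its local region.

    codes:    (N, K) array of code responses for N local descriptors
              over a K-word codebook.
    saliency: (N,) array in [0, 1] giving the visual importance of
              each descriptor's region, read from a saliency map.
    mode:     "max" for weighted max pooling, otherwise a
              saliency-weighted average.
    """
    # Down-weight code responses from low-saliency regions (background).
    weighted = codes * saliency[:, None]
    if mode == "max":
        return weighted.max(axis=0)  # max pooling over descriptors
    # Weighted average pooling, normalized by total saliency mass.
    return weighted.sum(axis=0) / max(saliency.sum(), 1e-12)
```

With plain max or average pooling every descriptor contributes equally, so background clutter can dominate the image feature; the saliency weighting above suppresses those responses before pooling.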
