A bag-of-objects retrieval model for web image search

Image search reranking has been an active research topic in recent years, aiming to boost the performance of existing web image search engines, which are mostly based on the textual metadata of images. Various approaches have been proposed to rerank images for general queries. We argue, however, that they may not be optimal for queries in a specific domain, e.g., object queries, since these reranking algorithms operate on whole images rather than on the relevant parts of images. In this paper, we propose a novel bag-of-objects retrieval model for image search reranking of object queries. First, we employ a common object discovery algorithm to discover query-relevant objects from the search results returned by a text-based image search engine. Then, the query and its result images are each represented as a language model over the query-relevant object vocabulary, from which the ranking function is derived. Because common object discovery is unreliable and may introduce noise, we incorporate the attributes of the discovered objects, e.g., size and position, into the ranking function through a linear model whose weights on the object attributes can be learned. Experiments on two subsets of the Web Queries dataset comprising object queries demonstrate that our approach significantly outperforms existing reranking methods on object queries.
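The pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's ranking function, which the abstract does not spell out. It assumes that object discovery has already assigned each detected object to a cluster in the object vocabulary, that each object carries two hypothetical attributes (`size`, `centrality`) scaled to [0, 1], that the attribute weights are fixed by hand rather than learned, that the query model is estimated from all initial results, and that images are scored by cross-entropy under Jelinek-Mercer smoothing. All names (`WEIGHTS`, `object_weight`, `language_model`, `rerank`) are illustrative.

```python
import math
from collections import defaultdict

# Hypothetical attribute weights. The paper learns such weights; here they
# are fixed by hand purely for illustration.
WEIGHTS = {"size": 0.7, "centrality": 0.3}


def object_weight(attrs, weights=WEIGHTS):
    """Linear combination of object attributes, clipped to be non-negative."""
    return max(0.0, sum(weights[k] * attrs[k] for k in weights))


def collection_model(images, vocab, eps=1e-3):
    """Attribute-weighted object distribution over all result images.

    Additive smoothing (eps) keeps every vocabulary entry strictly positive.
    """
    mass = defaultdict(float)
    for objects in images.values():
        for cluster_id, attrs in objects:
            mass[cluster_id] += object_weight(attrs)
    total = sum(mass[o] for o in vocab) + eps * len(vocab)
    return {o: (mass[o] + eps) / total for o in vocab}


def language_model(objects, vocab, background, lam=0.5):
    """Jelinek-Mercer smoothed object distribution for a single image."""
    counts = defaultdict(float)
    for cluster_id, attrs in objects:
        counts[cluster_id] += object_weight(attrs)
    total = sum(counts[o] for o in vocab)
    model = {}
    for o in vocab:
        ml = counts[o] / total if total > 0 else 0.0
        model[o] = (1 - lam) * ml + lam * background[o]
    return model


def rerank(images, vocab, lam=0.5):
    """Return image ids sorted from most to least relevant.

    Assumption: the query model equals the collection model, i.e. it is
    estimated from the initial text-based results (pseudo-relevance feedback).
    """
    background = collection_model(images, vocab)
    query = background
    scores = {}
    for image_id, objects in images.items():
        p = language_model(objects, vocab, background, lam)
        # Cross-entropy score: higher means closer to the query model.
        scores[image_id] = sum(query[o] * math.log(p[o]) for o in vocab)
    return sorted(scores, key=scores.get, reverse=True)
```

In this sketch an image whose large, centered objects belong to the cluster that dominates the result set scores higher than an image containing only a small, peripheral object from a rare cluster, which is the kind of noise the attribute weighting is meant to suppress.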
