Re-ranking by Multi-feature Fusion with Diffusion for Image Retrieval

We present a re-ranking algorithm for image retrieval that fuses information from multiple features. We use pairwise similarity scores to exploit the underlying relationships among images. For each feature, the initial ranked list for a query is represented as an undirected graph whose edge strengths come from the feature-specific image similarity. The graphs from the different features are combined by a mixture Markov model. To determine the weight of each graph, we use a probabilistic model based on the statistics of similarity scores of similar and dissimilar image pairs. The weights are query specific: the ranked lists of different queries receive different feature weights. Our approach to calculating weights is data-driven and does not require any learning. A diffusion process is then applied to the fused graph to reduce noise and achieve better retrieval performance. Experiments demonstrate that our approach significantly improves performance over baseline methods and outperforms many state-of-the-art retrieval methods.
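
The pipeline described above (per-feature graphs, a weighted mixture of Markov chains, then diffusion on the fused graph) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each feature's similarity matrix is row-normalized into a transition matrix, the per-feature weights are passed in by hand rather than estimated from similarity statistics, and the diffusion step is a generic random walk with restart at the query. The function names and the parameters `alpha` and `n_iter` are hypothetical.

```python
import numpy as np


def row_normalize(sim):
    """Turn a non-negative similarity matrix into a row-stochastic transition matrix."""
    sim = np.asarray(sim, dtype=float)
    row_sums = sim.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0  # leave isolated nodes as all-zero rows
    return sim / row_sums


def fuse_graphs(sims, weights):
    """Mix per-feature transition matrices with weights that are normalized to sum to 1.

    Assumption: the weights are given. The abstract derives them per query
    from similarity-score statistics, which is not reproduced here.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * row_normalize(s) for w, s in zip(weights, sims))


def diffuse(transition, query_index, alpha=0.85, n_iter=50):
    """Random walk with restart at the query (an assumed, generic diffusion)."""
    n = transition.shape[0]
    restart = np.zeros(n)
    restart[query_index] = 1.0
    scores = restart.copy()
    for _ in range(n_iter):
        # Propagate mass along the fused graph, then pull some back to the query.
        scores = alpha * (transition.T @ scores) + (1.0 - alpha) * restart
    return scores


def rerank(sims, weights, query_index=0):
    """Return the non-query image indices, sorted by diffused score (best first)."""
    scores = diffuse(fuse_graphs(sims, weights), query_index)
    order = np.argsort(-scores)
    return [int(i) for i in order if i != query_index]
```

Calling `rerank([sim_a, sim_b], [0.5, 0.5], query_index=0)` with two symmetric similarity matrices over the same candidate set returns the candidates reordered by their diffused scores; changing the weights per query mimics the query-specific weighting described above.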
