Image Retrieval with a Visual Thesaurus

Current state-of-the-art image retrieval methods represent an image as an unordered collection of local patches, each of which is assigned to a "visual word" from a fixed vocabulary. This paper presents a simple but innovative way to uncover the spatial relationships between visual words, so that we can discover words that represent the same latent topic and thereby improve retrieval results. The method is borrowed from text retrieval and is analogous to a text thesaurus in that it describes a broad set of equivalence relationships between words. We evaluate our method on the popular Oxford Buildings dataset, which makes it possible to compare it with previous work on image retrieval. The results show that our method is comparable to more complex state-of-the-art methods.
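The idea of a visual thesaurus can be illustrated with a minimal sketch. The abstract does not specify how spatial relationships are measured or how related words are used at query time, so everything below is an assumption made for illustration: words are treated as related when they appear within a fixed pixel radius of each other in at least `min_count` places, and a query histogram gives related words a fractional vote. The names `build_thesaurus`, `expand_query`, `radius`, `min_count`, and `weight` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import hypot


def build_thesaurus(images, radius=10.0, min_count=2):
    """Relate visual words that repeatedly occur close to each other.

    `images` is a list of images; each image is a list of
    (word_id, x, y) tuples. The radius and count thresholds are
    illustrative assumptions, not values from the paper.
    """
    pair_counts = Counter()
    for features in images:
        for (w1, x1, y1), (w2, x2, y2) in combinations(features, 2):
            # Count only pairs of different words that are spatially close.
            if w1 != w2 and hypot(x1 - x2, y1 - y2) <= radius:
                pair_counts[frozenset((w1, w2))] += 1

    thesaurus = defaultdict(set)
    for pair, count in pair_counts.items():
        if count >= min_count:
            a, b = tuple(pair)
            thesaurus[a].add(b)
            thesaurus[b].add(a)
    return dict(thesaurus)


def expand_query(words, thesaurus, weight=0.5):
    """Build a bag-of-words histogram in which related words get a partial vote."""
    hist = Counter()
    for w in words:
        hist[w] += 1.0
        for related in thesaurus.get(w, ()):
            hist[related] += weight
    return dict(hist)


# Toy data: words 1 and 2 appear close together in two images.
images = [
    [(1, 0, 0), (2, 3, 4), (3, 100, 100)],
    [(1, 10, 10), (2, 12, 11), (4, 50, 50)],
    [(1, 0, 0), (3, 5, 0)],
]
thesaurus = build_thesaurus(images)
query_hist = expand_query([1, 3], thesaurus)
print(thesaurus)
print(query_hist)
```

In this toy setup a query containing word 1 also contributes a reduced weight to word 2, which is how a thesaurus entry could let two differently quantized patches of the same structure match. A real system would operate on a large vocabulary with an inverted index rather than on in-memory lists.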
