Tag-based web photo retrieval improved by batch mode re-tagging

Web photos on social media sharing websites such as Flickr are generally accompanied by rich but noisy textual descriptions (tags, captions, categories, etc.). In this paper, we propose a tag-based photo retrieval framework that improves retrieval performance for Flickr photos by employing a novel batch mode re-tagging method. The proposed method automatically refines the noisy tags of a group of Flickr photos uploaded by the same user within a short period, leveraging millions of training web images and their associated rich textual descriptions. Specifically, for each group of Flickr photos, we construct a group-specific lexicon that contains only the tags of the photos within the group. For each query tag, we employ the inverted file method to automatically find loosely labeled training web images. We propose an SVM with Augmented Features, referred to as AFSVM, to learn adapted classifiers that refine the annotation tags of photos by leveraging the existing SVM classifiers of popular tags, which are associated with a large number of positive training web images. Moreover, to further refine the annotation tags of photos in the same group, we introduce an objective function that utilizes the visual similarities of photos within the group as well as the semantic proximities of their tags. Based on the refined tags, photos can be retrieved according to more reliable relevance scores. Extensive experiments demonstrate the effectiveness of our framework.
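The lexicon and inverted-file steps described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names (`build_inverted_file`, `group_lexicon`, `loosely_labeled_images`), the toy data, and the simple lowercase exact-match lookup are all assumptions made for illustration, and the AFSVM learning and group-level refinement steps are not covered.

```python
from collections import defaultdict


def build_inverted_file(training_images):
    """Map each word to the set of training image ids whose text contains it.

    `training_images` is a hypothetical dict: image id -> list of words taken
    from the image's surrounding textual description.
    """
    index = defaultdict(set)
    for image_id, words in training_images.items():
        for word in words:
            index[word.lower()].add(image_id)
    return index


def group_lexicon(group_photos):
    """Collect the tags of all photos in one group (group-specific lexicon)."""
    lexicon = set()
    for tags in group_photos.values():
        lexicon.update(tag.lower() for tag in tags)
    return lexicon


def loosely_labeled_images(index, tag):
    """Return training images whose text mentions `tag` (loosely labeled positives)."""
    return sorted(index.get(tag.lower(), set()))


# Toy data, invented for this example only.
training_images = {
    "web_001": ["beach", "sunset", "sea"],
    "web_002": ["dog", "park"],
    "web_003": ["sunset", "city"],
}
group_photos = {
    "photo_a": ["Sunset", "holiday"],
    "photo_b": ["beach"],
}

index = build_inverted_file(training_images)
lexicon = group_lexicon(group_photos)

# For every tag in the group lexicon, look up candidate training images.
candidates = {tag: loosely_labeled_images(index, tag) for tag in sorted(lexicon)}
print(candidates)
```

In this sketch a tag with no matching training text (here `holiday`) simply yields an empty candidate list; how such tags are handled in the actual framework is not specified in the abstract.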
