Learning Social Tag Relevance by Neighbor Voting

Social image analysis and retrieval is important for helping people organize and access the increasing amount of user tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications.

[1]  David W. Scott,et al.  Multivariate Density Estimation: Theory, Practice, and Visualization , 1992, Wiley Series in Probability and Statistics.

[2]  D. W. Scott,et al.  Multivariate Density Estimation, Theory, Practice and Visualization , 1992 .

[3]  Jing Huang,et al.  Image indexing using color correlograms , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  Shih-Fu Chang,et al.  Image Retrieval: Current Techniques, Promising Directions, and Open Issues , 1999, J. Vis. Commun. Image Represent..

[5]  Marcel Worring,et al.  Content-Based Image Retrieval at the End of the Early Years , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Stephen E. Robertson,et al.  A probabilistic model of information retrieval: development and comparative experiments - Part 1 , 2000, Inf. Process. Manag..

[7]  Stephen E. Robertson,et al.  A probabilistic model of information retrieval: development and comparative experiments - Part 2 , 2000, Inf. Process. Manag..

[8]  Rong Yan,et al.  Multimedia Search with Pseudo-relevance Feedback , 2003, CIVR.

[9]  Rong Jin,et al.  Web image retrieval re-ranking with relevance model , 2003, Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003).

[10]  Fiona Fui-Hoon Nah,et al.  A study on tolerable waiting time: how long are Web users willing to wait? , 2004, AMCIS.

[11]  Edward Y. Chang,et al.  CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines , 2003, IEEE Trans. Circuits Syst. Video Technol..

[12]  David A. Forsyth,et al.  Matching Words and Pictures , 2003, J. Mach. Learn. Res..

[13]  Raimondo Schettini,et al.  Image annotation using SVM , 2003, IS&T/SPIE Electronic Imaging.

[14]  Heung-Kyu Lee,et al.  Majority Based Ranking Approach in Web Image Retrieval , 2003, CIVR.

[15]  Justin Zobel,et al.  Questioning Query Expansion: An Examination of Behaviour and Parameters , 2004, ADC.

[16]  Pietro Perona,et al.  A Visual Category Filter for Google Images , 2004, ECCV.

[17]  Gang Wang,et al.  TRECVID 2004 Search and Feature Extraction Task by NUS PRIS , 2004, TRECVID.

[18]  Latifur Khan,et al.  Image annotations by combining multiple evidence & wordNet , 2005, ACM Multimedia.

[19]  Grigory Begelman,et al.  Automated Tag Clustering: Improving search and exploration in the tag space , 2006 .

[20]  Shih-Fu Chang,et al.  Video search reranking via information bottleneck principle , 2006, MM '06.

[21]  Bernardo A. Huberman,et al.  Usage patterns of collaborative tagging systems , 2006, J. Inf. Sci..

[22]  Milind R. Naphade,et al.  Semantic Multimedia Retrieval using Lexical Query Expansion and Model-Based Reranking , 2006, 2006 IEEE International Conference on Multimedia and Expo.

[23]  Krystyna K. Matusiak Towards user-centered indexing in digital image collections , 2006, OCLC Syst. Serv..

[24]  Wei-Ying Ma,et al.  Image annotation by large-scale content-based image retrieval , 2006, MM '06.

[25]  James Ze Wang,et al.  Tagging over time: real-world image annotation by lightweight meta-learning , 2007, ACM Multimedia.

[26]  James Ze Wang,et al.  Toward Bridging the Annotation-Retrieval Gap in Image Search , 2007, IEEE MultiMedia.

[27]  Jean-Marc Odobez,et al.  A Thousand Words in a Scene , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Rong Yan,et al.  Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News , 2007, IEEE Transactions on Multimedia.

[29]  Rainer Lienhart,et al.  Image retrieval on large-scale image databases , 2007, CIVR '07.

[30]  Pablo Rodriguez,et al.  I tube, you tube, everybody tubes: analyzing the world's largest user generated content video system , 2007, IMC '07.

[31]  Rong Yan,et al.  Semantic concept-based query expansion and re-ranking for multimedia retrieval , 2007, ACM Multimedia.

[32]  David A. Shamma,et al.  Watch what I watch: using community activity to understand content , 2007, MIR '07.

[33]  Marcel Worring,et al.  Adding Semantics to Detectors for Video Retrieval , 2007, IEEE Transactions on Multimedia.

[34]  Mor Naaman,et al.  How flickr helps us make sense of the world: context and content in community-contributed media collections , 2007, ACM Multimedia.

[35]  Gustavo Carneiro,et al.  Supervised Learning of Semantic Classes for Image Annotation and Retrieval , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  James Ze Wang,et al.  Real-Time Computerized Annotation of Pictures , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[38]  Nenghai Yu,et al.  Flickr distance , 2008, ACM Multimedia.

[39]  Marcel Worring,et al.  Learning tag relevance by neighbor voting for social image retrieval , 2008, MIR '08.

[40]  Samy Bengio,et al.  A Discriminative Kernel-Based Approach to Rank Images from Text Queries , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[42]  Roelof van Zwol,et al.  Flickr tag recommendation based on collective knowledge , 2008, WWW.

[43]  Shumeet Baluja,et al.  VisualRank: Applying PageRank to Large-Scale Image Search , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[44]  Changhu Wang,et al.  Scalable search-based image annotation , 2008, Multimedia Systems.

[45]  Wei-Ying Ma,et al.  Annotating Images by Mining Image Search Results , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Kilian Q. Weinberger,et al.  Resolving tag ambiguity , 2008, ACM Multimedia.

[47]  Marcel Worring,et al.  Annotating images by harnessing worldwide user-tagged photos , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.