Image tag clarity: in search of visual-representative tags for social images

Tags associated with images on social media sharing websites are a valuable source of information for improving image retrieval. Due to the nature of tagging, however, many tags associated with images are not visually descriptive. In this paper, we propose Normalized Image Tag Clarity (NITC) to evaluate how effectively a tag describes the visual content of the images it annotates. NITC is computed as the zero-mean normalized distance between the tag language model, estimated from the images annotated with the tag, and the collection language model. Visual-representative tags, which are commonly used to annotate visually similar images, receive high tag clarity scores. In an evaluation on a large real-world dataset containing more than 269K images and their associated tags, we show that the NITC score can effectively identify the visual-representative tags among all user-contributed tags. We also demonstrate through experiments that most popular tags are indeed visually representative.
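The scoring idea described above can be illustrated with a minimal sketch. The abstract does not name the distance measure, the image representation, or the normalization procedure, so everything concrete here is an assumption: images are taken to be bags of visual-word ids, the distance is taken to be KL divergence, and the zero-mean normalization is taken to be a z-score against random image sets of the same size as the tagged set. All function and parameter names are hypothetical.

```python
import math
import random
from collections import Counter


def language_model(images, vocab_size, smoothing=1.0):
    """Additively smoothed unigram distribution over visual-word ids.

    Assumption: each image is a list of integer visual-word ids
    in the range [0, vocab_size).
    """
    counts = Counter(word for image in images for word in image)
    total = sum(counts.values()) + smoothing * vocab_size
    return [(counts.get(w, 0) + smoothing) / total for w in range(vocab_size)]


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits (assumed distance measure)."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def tag_clarity(tagged_images, collection_model, vocab_size):
    """Distance between the tag language model and the collection model."""
    tag_model = language_model(tagged_images, vocab_size)
    return kl_divergence(tag_model, collection_model)


def normalized_tag_clarity(tagged_images, all_images, vocab_size,
                           trials=200, seed=0):
    """Z-score of the tag's clarity against random same-size image sets.

    Assumption: the baseline mean and standard deviation are estimated
    by sampling `trials` random image sets from the collection.
    """
    rng = random.Random(seed)
    collection_model = language_model(all_images, vocab_size)
    observed = tag_clarity(tagged_images, collection_model, vocab_size)
    baseline = [
        tag_clarity(rng.sample(all_images, len(tagged_images)),
                    collection_model, vocab_size)
        for _ in range(trials)
    ]
    mean = sum(baseline) / trials
    std = math.sqrt(sum((b - mean) ** 2 for b in baseline) / trials)
    return (observed - mean) / std if std > 0 else 0.0
```

Under these assumptions, a tag whose images share similar visual words yields a language model that diverges strongly from the collection model and so scores well above the random baseline, while a tag applied to visually unrelated images scores near zero.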
