Towards data-driven estimation of image tag relevance using visually similar and dissimilar folksonomy images

Non-relevant tags in an image folksonomy hamper the effective organization and retrieval of images. This paper therefore discusses a novel technique for estimating the relevance of user-supplied tags with respect to the content of a seed image. Specifically, it proposes to compute the relevance of image tags using both visually similar and visually dissimilar images. Compared to tag relevance estimation that uses only visually similar images, this widens the gap in estimated relevance between tags that are relevant and tags that are irrelevant to the content of the seed image, at a limited increase in computational cost, making the two groups easier to distinguish. Experiments with subsets of MIRFLICKR-25000 and MIRFLICKR-1M confirm this: tag relevance estimation using both visually similar and dissimilar images yields more effective image tag refinement and tag-based image retrieval than estimation using only visually similar images.
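The idea can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring formula, so the rule below is an assumption chosen only for illustration: each seed tag is scored as the fraction of visually similar images carrying it minus the fraction of visually dissimilar images carrying it. The function name `tag_relevance`, the example tags, and the pre-selected neighbor sets are all hypothetical; in practice the similar and dissimilar images would come from a visual similarity search over the folksonomy.

```python
from collections import Counter


def tag_relevance(seed_tags, similar_tag_sets, dissimilar_tag_sets):
    """Illustrative score per seed tag (an assumed rule, not the paper's formula):
    share of similar images with the tag minus share of dissimilar images with it."""
    # Count each tag at most once per image.
    sim_votes = Counter(t for tags in similar_tag_sets for t in set(tags))
    dis_votes = Counter(t for tags in dissimilar_tag_sets for t in set(tags))
    n_sim = max(len(similar_tag_sets), 1)  # avoid division by zero
    n_dis = max(len(dissimilar_tag_sets), 1)
    return {t: sim_votes[t] / n_sim - dis_votes[t] / n_dis for t in seed_tags}


# Hypothetical seed image tagged by its uploader.
seed_tags = ["beach", "sea", "2010"]
# Tags of images assumed to be visually similar to the seed image.
similar = [{"beach", "sea"}, {"beach", "2010"}, {"sea", "beach"}]
# Tags of images assumed to be visually dissimilar to the seed image.
dissimilar = [{"2010", "city"}, {"2010"}, {"car"}]

scores = tag_relevance(seed_tags, similar, dissimilar)
ranked = sorted(seed_tags, key=scores.get, reverse=True)
print(ranked, scores)
```

In this toy example the content-independent tag "2010" appears in both neighbor sets, so the votes from dissimilar images pull its score below zero, while "beach" and "sea" keep their positive scores. With votes from similar images alone, "2010" would still score above zero, which is the narrower gap the abstract refers to.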
