Image retagging

Online social media repositories such as Flickr and Zooomr allow users to manually annotate their images with freely chosen tags, which are then used as indexing keywords to facilitate image search and other applications. However, even though these tags are provided by human beings, they are frequently imprecise and incomplete, and many of them are meaningful only to the image owner (such as the name of a dog). Thus there is still a gap between these tags and the actual content of the images, which significantly limits tag-based applications such as search and browsing. To tackle this issue, this paper proposes a social image "retagging" scheme that aims to assign better content descriptors to images. The refining process, which includes denoising and enriching, is formulated as an optimization framework based on the consistency between "visual similarity" and "semantic similarity" in social images; that is, visually similar images tend to have similar semantic descriptors, and vice versa. An effective iterative bound optimization algorithm is applied to learn the improved tag assignment. In addition, as many tags are intrinsically not closely related to the visual content of the images, we employ a knowledge-based method to differentiate visual content-related tags from unrelated ones and then constrain the tagging vocabulary of our automatic algorithm to the content-related tags. Finally, to improve the coverage of the tags, we further enrich the tag set with appropriate synonyms and hypernyms drawn from an external knowledge base. Experimental results on a Flickr image collection demonstrate the effectiveness of this approach. We also show the remarkable performance improvements brought by retagging in two applications: tag-based search and automatic annotation.
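
The consistency idea in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the objective ||W - Y S Y^T||^2 + alpha ||Y - Y0||^2, the names W (visual similarity between images), S (semantic similarity between tags), Y0 (user-provided tag matrix), and alpha, and the toy data. The solver is a plain projected gradient descent with step halving, swapped in for simplicity; it is not the iterative bound optimization algorithm the abstract refers to.

```python
import numpy as np


def consistency_loss(Y, W, S, Y0, alpha=1.0):
    """Visual/semantic mismatch plus a penalty for drifting from the user tags."""
    R = W - Y @ S @ Y.T  # residual between visual and tag-induced similarity
    return float(np.sum(R ** 2) + alpha * np.sum((Y - Y0) ** 2))


def retag(W, S, Y0, alpha=1.0, step=0.1, iters=200):
    """Refine tag scores by projected gradient descent (illustrative solver only)."""
    Y = Y0.astype(float).copy()
    best = consistency_loss(Y, W, S, Y0, alpha)
    for _ in range(iters):
        R = W - Y @ S @ Y.T
        # Gradient of the two terms of the assumed objective with respect to Y.
        grad = -2.0 * (R @ Y @ S.T + R.T @ Y @ S) + 2.0 * alpha * (Y - Y0)
        # Keep tag scores inside [0, 1].
        cand = np.clip(Y - step * grad, 0.0, 1.0)
        cand_loss = consistency_loss(cand, W, S, Y0, alpha)
        if cand_loss < best:
            Y, best = cand, cand_loss  # accept only steps that lower the loss
        else:
            step *= 0.5  # otherwise shrink the step and try again
    return Y


# Toy data (hypothetical): images 0 and 1 look alike, image 2 is different.
W = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
S = np.eye(2)                      # two unrelated tags: column 0 "dog", column 1 "beach"
Y0 = np.array([[1.0, 0.0],         # image 0 tagged "dog"
               [0.0, 0.0],         # image 1 has no tags
               [0.0, 1.0]])        # image 2 tagged "beach"

refined = retag(W, S, Y0)
print(np.round(refined, 2))
```

In this toy setup the untagged image 1 picks up a positive "dog" score because it is visually close to image 0, while its "beach" score stays at zero. That is the enriching behavior the abstract describes, shown only under the assumptions stated above.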
