Tag Tagging: Towards More Descriptive Keywords of Image Content

Tags have proven effective and efficient for organizing and searching social image content. However, these human-provided keywords are far from a comprehensive description of the image content, which limits their effectiveness in tag-based image search. In this paper, we propose an automatic scheme called tag tagging, which supplements semantic image descriptions by associating a group of property tags with each existing tag. For example, an initial tag “tiger” may be further tagged with “white”, “stripes”, and “bottom-right” along three tag properties: color, texture, and location, respectively. In this way, the descriptive ability of the existing tags can be greatly enhanced. In the proposed scheme, a lazy learning approach is first applied to estimate the image regions corresponding to each initial tag; a set of property tags is then derived for each initial tag along six properties: location, color, texture, size, shape, and dominance. These tag properties enable much more precise image search, especially when certain tag properties are included in the query. Empirical evaluation shows that tag properties remarkably boost the performance of social image retrieval.
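
To make the idea of property tags concrete, the following is a minimal Python sketch of how an initial tag and its property tags might be stored and queried. It is an illustration under stated assumptions, not the paper's implementation: the names `TaggedTag`, `PROPERTIES`, and `search` are hypothetical, the property values are entered by hand, and the region-estimation step (the lazy learning approach) is not shown at all.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The six tag properties named in the abstract.
PROPERTIES = ("location", "color", "texture", "size", "shape", "dominance")


@dataclass
class TaggedTag:
    """An initial tag plus its property tags (hypothetical structure)."""
    tag: str
    properties: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject property names outside the six listed in the abstract.
        unknown = set(self.properties) - set(PROPERTIES)
        if unknown:
            raise ValueError(f"unknown tag properties: {sorted(unknown)}")


def search(images: dict[str, list[TaggedTag]], tag: str, **wanted: str) -> list[str]:
    """Return ids of images that carry `tag` with all requested property values.

    Exact string matching is an assumption made for this sketch; the abstract
    does not say how property tags are matched or ranked.
    """
    hits = []
    for image_id, tags in images.items():
        for t in tags:
            if t.tag == tag and all(t.properties.get(k) == v for k, v in wanted.items()):
                hits.append(image_id)
                break
    return hits


# Hand-written example data, following the "tiger" example in the abstract.
images = {
    "img1": [TaggedTag("tiger", {"color": "white", "texture": "stripes",
                                 "location": "bottom-right"})],
    "img2": [TaggedTag("tiger", {"color": "orange", "texture": "stripes",
                                 "location": "center"})],
}

plain_results = search(images, "tiger")
white_results = search(images, "tiger", color="white")
```

A plain tag query for "tiger" matches both example images, while adding `color="white"` narrows the result to the first one, which is the kind of precision gain the abstract attributes to including tag properties in the query.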
