Usage Based Tag Enhancement of Images

Appropriate tagging of images is at the heart of efficient recommendation and retrieval and is used for indexing image content. Existing image-tagging technologies either focus on what the image contains, based on visual analysis, or use tags drawn from the textual content accompanying the image. The former is insufficient for a complete understanding of how the image is perceived and used in various contexts, while the latter yields many irrelevant tags, particularly when the accompanying text is long. To address this issue, we propose a graph-based random-walk algorithm that extracts only image-relevant tags from the accompanying text. We evaluate our scheme in detail, checking its viability with human annotators and comparing it with state-of-the-art algorithms. Experimental results show that the proposed algorithm outperforms baseline algorithms with respect to different metrics.
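The abstract does not specify how the graph is built or how the walk is run, so the following is only a minimal sketch of the general idea, not the proposed algorithm itself. It assumes a word co-occurrence graph over the accompanying text and a random walk with restart seeded at words known to describe the image (for example, labels from a visual analysis). The function names, the `window` and `restart` parameters, and the toy sentences are all hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=2):
    """Build an undirected weighted graph (assumed design): words are nodes,
    and an edge weight counts how often two words occur within `window`
    positions of each other in the same sentence."""
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in sentences:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    graph[word][other] += 1.0
                    graph[other][word] += 1.0
    return graph


def random_walk_with_restart(graph, seeds, restart=0.15, iterations=50):
    """Approximate the stationary distribution of a walk that, at each step,
    jumps back to a seed word with probability `restart` and otherwise moves
    to a neighbor with probability proportional to the edge weight."""
    nodes = list(graph)
    seeds = sorted({s for s in seeds if s in graph})
    if not seeds:
        raise ValueError("no seed word appears in the graph")
    restart_mass = {node: 0.0 for node in nodes}
    for seed in seeds:
        restart_mass[seed] = 1.0 / len(seeds)
    score = dict(restart_mass)
    for _ in range(iterations):
        nxt = {node: restart * restart_mass[node] for node in nodes}
        for node in nodes:
            total = sum(graph[node].values())
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += (1.0 - restart) * score[node] * weight / total
        score = nxt
    return score


def top_tags(score, seeds, k=3):
    """Return the k highest-scoring words that are not seeds themselves."""
    candidates = [(word, s) for word, s in score.items() if word not in seeds]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in candidates[:k]]


# Toy accompanying text: one sentence about the image, one unrelated sentence.
sentences = [
    ["dog", "catches", "frisbee", "park"],
    ["stock", "market", "falls"],
]
graph = build_cooccurrence_graph(sentences)
scores = random_walk_with_restart(graph, seeds=["dog"])
tags = top_tags(scores, seeds={"dog"})
```

In this toy setup, words connected to the seed accumulate probability mass, while words from the unrelated sentence receive none because the walk can never reach them. This mirrors the goal of keeping only image-relevant tags when the accompanying text is long.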
